Nov1

generating excel files programmatically

Posted by: Brian Chan | Filed in: technology | Tags: | 05:32 pm, November 1st, 2011 No Comments »

I remember dealing with this problem a few years ago. At that time, I didn’t know better so ended up generating tab delimited files. A few years later, I finally gain my dignity back by generating real excel files.

I found many libraries out there, below are a few:

Ruby roo: http://roo.rubyforge.org/
Ruby spreadsheet: http://spreadsheet.rubyforge.org/
OpenXML: http://openxmldeveloper.org/
Apache POI (Java): http://poi.apache.org/index.html
Pear Spreadsheet_Excel_Writer (PHP): http://pear.php.net/manual/en/package.fileformats.spreadsheet-excel-writer.php
PHPExcel: http://phpexcel.codeplex.com/
Python Excel: http://www.python-excel.org/

I believe they will all work just fine as they all look promising.

I played with PHPExcel a little. It’s pretty good in what it does. If you are not going with crazy charts and formulas, PHPExcel should be more than enough for you. It also comes with many sample php files. Recommended!



Nov1

GROUP_CONCAT

Posted by: Brian Chan | Filed in: technology | Tags: , | 01:06 pm, November 1st, 2011 No Comments »

GROUP_CONCAT is very useful when you are doing a GROUP BY, and want to capture the values from other columns. For example:

1
2
3
4
SELECT country, GROUP_CONCAT(city SEPARATOR ',') AS city_arr_str
FROM myTable
GROUP BY country
ORDER BY country

Then in say PHP, you can read the city names for each returned row:

1
2
3
4
5
6
7
foreach ($rows as $r){
  $country = $r["country"];
  $city_arr_str = $r["city_arr_str"];
  // parse the city names
  $city_arr = explode(",", $city_arr_str);
  // do your stuff...
}

Everything is fine and all, until later on your database gets much bigger.

Imagine now your db table gets lots of duplicates, and each country has more than 1000 cities. When you run the same query and code, you will notice something seems wrong… It turns out there is a length limit to what GROUP_CONCAT returns. The length is 1024 characters.

To change that limit, go to your information_schema db, search for GROUP_CONCAT_MAX_LEN in the GLOBAL_VARIABLES table, then change the default value 1024 to something higher. Below is the sql query.

1
SET [GLOBAL | SESSION] group_concat_max_len = 1024;

But if you don’t have the permission to make the above change, or you just want a quick fix, below is a way to get rid of the duplicates hopefully the returned string is shorter than 1024 characters.

1
2
3
4
SELECT country, GROUP_CONCAT(DISTINCT city ORDER BY city SEPARATOR ',') AS city_arr_str
FROM myTable
GROUP BY country
ORDER BY country

I threw in the “ORDER BY” so we don’t need to do a sort again in PHP. Anyhow, that’s the idea. Happy programming! ;)



Jun18

mysql python installation error on osx

Posted by: Brian Chan | Filed in: DIY | Tags: , , | 04:52 pm, June 18th, 2011 No Comments »

I installed MySQL-Python (version 1.2.3 at the time being) from http://sourceforge.net/projects/mysql-python/ on my osx 10.6, python version 2.6.1. Just do the usual “python setup.py build” and “python setup.py install”. However though, in the python cli, when I do “import MySQLdb”, I got some error.

1
2
3
4
5
6
7
8
9
10
>>> import MySQLdb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.macosx-10.6-universal/egg/MySQLdb/__init__.py", line 19, in <module>
  File "build/bdist.macosx-10.6-universal/egg/_mysql.py", line 7, in <module>
  File "build/bdist.macosx-10.6-universal/egg/_mysql.py", line 6, in __bootstrap__
ImportError: dlopen(/Users/birdchan/.python-eggs/MySQL_python-1.2.3-py2.6-macosx-10.6-universal.egg-tmp/_mysql.so, 2): Library not loaded: libmysqlclient.18.dylib
  Referenced from: /Users/birdchan/.python-eggs/MySQL_python-1.2.3-py2.6-macosx-10.6-universal.egg-tmp/_mysql.so
  Reason: image not found
>>>

Obviously, the error is “Library not loaded: libmysqlclient.18.dylib”. A quick fix is to simply create a symlink.

1
sudo ln -s /usr/local/mysql/lib/libmysqlclient.18.dylib /usr/lib/libmysqlclient.18.dylib

I suppose you can also play with your env path.



May25

mod rewrite

Posted by: Brian Chan | Filed in: technology | Tags: | 10:32 pm, May 25th, 2011 No Comments »

If you want to enable wordpress’ permalinks, it’s actually quite simple. Instead of displaying:

http://birdchan.com/home/?p=2034

you can have it displayed as:

http://birdchan.com/home/2011/05/25/mod-rewrite/

Picking the permalink format in the wordpress admin page is quite easy. The part that is a little tricky is on the server. Here are the steps:

1. If you have done it right, the .htaccess in your wordpress directory should look something like this:

1
2
3
4
5
6
7
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /home/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /home/index.php [L]
</IfModule>

2. Then check your apache conf file. I am using ubuntu apache2 so your mileage may vary. Make sure you see the following settings enabled in /etc/apache2/sites-available/default

1
2
3
4
5
<Directory /var/www/>
       Options FollowSymLinks
       AllowOverride All
       ...
</Directory>

3. Then restart apache

1
sudo /etc/init.d/apache2 restart


Mar28

Perl Arrays of Arrays

Posted by: Brian Chan | Filed in: Uncategorized | Tags: | 03:18 pm, March 28th, 2011 No Comments »

Coming from the python background, this arrays of arrays in perl has given me so much pain. I decided to put down this cheat sheet here for future reference. Basic perl knowledge is assumed.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
    # assign to our array, an array of array references
    @AoA = (
           [ "fred", "barney" ],
           [ "george", "jane", "elroy" ],
           [ "homer", "marge", "bart" ],
    );
    print $AoA[2][2];
 
    # assign a reference to array of array references
    $ref_to_AoA = [
        [ "fred", "barney", "pebbles", "bambam", "dino", ],
        [ "homer", "bart", "marge", "maggie", ],
        [ "george", "jane", "elroy", "judy", ],
    ];
    print $ref_to_AoA->[2][2];

For @array, use parentheses, i.e. the top example. For a reference, use square brackets, i.e. the bottom example.

Example for populating an array:

1
2
3
my @array = ();
push @array, ["John Smith", 35];
push @array, ["Mary Brown", 27];

Notice the square brackets here. This is because we want to keep John and 35 as one entity, so we use the square brackets here to make a reference being pushed to the array. If we use the parentheses here it will be a disaster.

How to read the above?

1
2
3
4
5
foreach my $entity (@array){
  my $name = $entity->[0];
  my $age = $entity->[1];
  # do something...
}

If in python, it will be hecka easy. There is no worries about square brackets or parentheses, arrays or references. The mind can be fully devoted to the logic flow, instead of the language syntax.

?View Code PYTHON
1
2
3
4
5
6
7
L = []
L.append(["John", 35])
L.append(["Mary", 27])
for entity in L:
  name = entity[0]
  age = entity[1]
  print name + " : " + str(age)

There is more pain to this perl arrays of arrays thing… Learn more of it at: http://sunsite.ualberta.ca/Documentation/Misc/perl-5.6.1/pod/perllol.html

Gosh, everyone should be using python, at least not perl.



Mar1

MySQL Storage Engines

Posted by: Brian Chan | Filed in: technology | Tags: | 02:50 pm, March 1st, 2011 No Comments »

Whenever I need to create a mysql db, I always wonder why there are so many storage engines to choose from. According to the official mysql site (http://dev.mysql.com/doc/refman/5.1/en/storage-engines.html), there are currently 10 storage engines available:

MyISAM
InnoDB
IBMDB2I
MERGE
MEMORY
EXAMPLE
FEDERATED
ARCHIVE
CSV
BLACKHOLE

Which one to choose from?

I have heard that MyISAM and InnoDB are the two most popular ones. But why? I am still not sure. Maybe someday I should gather all the info and write a little comparison table.

Meanwhile, use this handy tool to help you pick your storage engine: http://www.scaledb.com/engineselector.html



Feb16

Perl XML modules

Posted by: Brian Chan | Filed in: technology | Tags: , | 05:05 pm, February 16th, 2011 No Comments »

XML::Simple

Provides an extremely simple API to read and write XML into and out of Perl native data structures. Simple for simple things, but has some gotchas, which led to the creation of a "strict mode" for the API and a guide for switching to XML::LibXML.

XML::Twig

Very powerful module to read and process even huge XML documents. Allows you to process the XML one "twig" (tag with all its children and content) at a time, giving you the data either as native Perl datastructure or a structure of objects. See also: xmltwig.com

XML::Rules

An event based XML parser. It allows you to specify actions to execute once a tag is fully parsed (including the children) and passes the data produced by the actions to the ones attached to the parent tags.

XML::LibXML

This is really the best option in the general case; very powerful. It’s a wrapper for the libxml2 C library so it’s very fast and has low memory usage. It has a complete and compliant implementation of XPath, which is almost the only sane way to deal with XML. It takes a little to get used to the module because there is only terse reference documentation, but it’s worth toughing out the initial learning curve, and if you already know the DOM API you have a sizeable head-start. There is also XML::LibXSLT which uses the c library libxslt (a companion to libxml2) to perform XSL Transformations (XSLT).

See also: The Perl XML FAQ


Modules that should not be used any more

These modules have had their use but are no longer properly maintained, and there are better alternatives to using them:

  • XML::Parser was the first XMLmodule for Perl. It is based on the expat library. It is very low level, and barely maintained. In most cases XML::LibXML will be faster and more powerful, and XML::Twig more convenient. XML::Parser is still a useful module, but more as a low-level layer on which more convenient modules can be built.
  • XML::XPath appears to be no longer maintained, XML::LibXML is more efficient, more powerful and better maintained. It is also really easy to port code from XML::XPath to XML::LibXML as they both implement the DOM and XPath in similar ways.
  • XML::DOM was the first implementation of the DOM in Perl. It is based on XML::Parser and expat. It is now barely maintained, and XML::LibXML should be used instead. Porting the code should be straightforward. If you can't switch to XML::LibXML, because you can't or don't want to install libxml2, then at least have a look at XML::DOM::XPath which adds XPath support to XML::DOM.

ref link: https://www.socialtext.net/perl5/index.cgi?recommended_xml_modules



Nov12

pretty_print xml

Posted by: Brian Chan | Filed in: technology | Tags: , | 03:25 pm, November 12th, 2010 No Comments »

Imagine you received a huge xml file but without line breaks… It’s one long line… Then you were told to make use of one section of it…

I know how to view the xml structure by using the XML Notepad, but I will have problems to copy and paste certain parts of it out.

I tried to replace “>” with “>\n” and it was a little better at first, but I would then have to manually insert the indentations… I was tempted to write a quick and dirty xml parser… then I was like, I don’t think I am the only one having this issue.

I did some searches and found that there is an easy way to pretty print the xml files. I am on win7 btw.

First install Notepad++. You can download it at their official site or use the portable version.

Then copy and paste your xml string into notepad++. Opening your file is fine too. It looks pretty good already, but you can make it even better.

Go to TextFX -> TextFX HTML Tidy -> Tidy: Reindent XML

You are done!

P.S. I found an online tool at: http://www.shell-tools.net/index.php?op=xml_format



Aug17

jQuery vs MooTools

Posted by: Brian Chan | Filed in: technology | Tags: , , | 11:57 am, August 17th, 2010 No Comments »

If you are looking into learning javascript frameworks, this will likely be a question you will ask eventually. jQuery or MooTools? They are both very well known and well developed. By chance I came across this site http://jqueryvsmootools.com/ and I was quite impressed. The author did a pretty comprehensive analysis of these two frameworks. Good stuff! Be warned though, it’s quite a long read! ;)



Jul21

itheme

Posted by: Brian Chan | Filed in: technology | Tags: , | 07:05 am, July 21st, 2010 No Comments »

Currently my blog theme is itheme from N.Design Studio. You can download it at this link: http://www.ndesign-studio.com/wp-themes/itheme.

If you try to install it yourself, you will notice something is missing. Exactly, the calendar icon is missing, and the date is not showing. Being a lazy programmer, it was ok for me at first. But after a while it bothered me so much that, I was so determined to fix it. It took me a while, but it’s worth it! ;)

Just so you don’t have to go through all the trouble, here are the steps:

  1. Download http://www.ndesign-studio.com/demo/wordpress/wp-content/themes/itheme-1-1/style.css
  2. Download http://www.ndesign-studio.com/demo/wordpress/wp-content/themes/itheme-1-1/print.css
  3. Download http://www.ndesign-studio.com/demo/wordpress/wp-content/themes/itheme-1-1/images/calendar.gif
  4. Save the two css files in your itheme folder, then save the gif file in your itheme/images/ folder.
  5. Now we will modify the single.php file. Open it up with vi or whatever you like
  6. Jump to line 9, in-between the div line and the h2 line, stick in these two lines:
    9
    10
    
           <div class="date"><span><?php the_time("M"); ?></span><?php the_time("j"); ?></div>
           <div class="title">
  7. Now jump to your new line 14, between postdata and entry, stick in the following line:
    14
    
            </div>
  8. Save single.php. That is for the individual posts. Now do the same thing to index.php for the front page.
  9. Then you are done. Reload your website and see the nice calendar icon! ;)

To double check, your final code in single.php should look like the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
<?php get_header(); ?>
  <div id="content">
 
  <div class="post-nav"> <span class="previous"><?php previous_post_link('%link') ?></span> <span class="next"><?php next_post_link('%link') ?></span></div>
 
  <?php if (have_posts()) : while (have_posts()) : the_post(); ?>
 
    <div class="post" id="post-<?php the_ID(); ?>">
        <div class="date"><span><?php the_time("M"); ?></span><?php the_time("j"); ?></div>
        <div class="title">
        <h2><a href="<?php echo get_permalink() ?>" rel="bookmark" title="Permanent Link: <?php the_title(); ?>"><?php the_title(); ?></a></h2>
 
                <div class="postdata"><span class="left mini-category">Filed in: <?php the_category(', ') ?></span> <span class="right mini-add-comment"><a href="#respond">Add comments</a></span></div>
         </div>
 
                <div class="entry">
                <?php the_content('<p class="serif">Continue reading &raquo;</p>'); ?>
 
                <?php link_pages('<p><strong>Pages:</strong> ', '</p>', 'number'); ?>
 
                <?php edit_post_link('Edit', '', ''); ?>
 
                </div><!--/entry -->
 
                <?php comments_template(); ?>
 
                        <?php endwhile; else: ?>
 
                <p>Sorry, no posts matched your criteria.</p>
 
<?php endif; ?>
 
        </div><!--/post -->
 
  </div><!--/content -->
 
  <div id="footer"><a href="http://www.ndesign-studio.com/resources/wp-themes/">WP Theme</a> &amp; <a href="http://www.ndesign-studio.com/stock-icons/">Icons</a> by <a href="http://www.ndesign-studio.com">N.Design Studio</a></div>
</div><!--/left-col -->
 
<?php get_sidebar(); ?>
 
<?php get_footer(); ?>