What Sphinx is and what it can do

In my experience with Sphinx integrations, I’ve seen that many people don’t understand, both technically and literally, what SphinxSearch is, what it can do and, most important, what it is NOT and what it CANNOT do.

First, let’s look at the idea behind Sphinx. Sphinx probably appeared because its creator was not satisfied with the performance of MySQL (or any other database) when doing a search, more exactly a TEXT search. This main idea is, even now at version 2, what Sphinx does and what it is good at: FAST text search. How does it do that? Simply explained, it uses a kind of inverted index. What’s that? Well, if you have a text, you can represent it in two ways: one is to store the text as it is, the other is to store all the words in that text, along with the number of occurrences and (maybe) the position of each word in the text. The first thing you need to know about Sphinx is that it never stores the full text.
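To make the inverted index idea concrete, here is a minimal sketch in plain PHP. It is only a toy: buildInvertedIndex() is a made-up helper, and Sphinx’s real index format is far more elaborate. It just shows the "word, documents, positions" representation described above.

```php
<?php
// Build a toy inverted index: word => [docId => [positions...]].
// buildInvertedIndex() is a hypothetical helper, not part of Sphinx.
function buildInvertedIndex(array $docs)
{
    $index = array();
    foreach ($docs as $docId => $text) {
        // Lowercase the text and split it on anything that is not a word character.
        $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $position => $word) {
            $index[$word][$docId][] = $position;
        }
    }
    return $index;
}

$index = buildInvertedIndex(array(
    1 => 'Techno music festival',
    2 => 'Classical music, more music',
));
// $index['music'] maps doc 1 to positions [1] and doc 2 to positions [1, 3].
```

The number of occurrences of a word in a document is simply the count of its positions. Note that the original text cannot be read back from this structure as it is, which is why a search returns ids and not texts.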

Now you have some tables and you want Sphinx to do the search job. What exactly happens? Sphinx needs to get that data. Currently there are 2 ways: giving it either an SQL query or an XML file. Careful: an SQL query! It can be short or long, but it is a query. It can be simple or it can have 10 joins in it. What really matters is closely tied to what you want to search. Even in a classic SQL search, in the end you want something returned from the search: a book, a user, a message, whatever. The important thing is that the item you want has an ID. In MySQL it’s usually a primary key, called id or whatever_id. So, getting back to our query, you are throwing some data into Sphinx. Each item needs to have a unique id. Sphinx doesn’t know about primary keys and it doesn’t care; it treats the id you give it as an integer. What is really important is that this integer is unique, otherwise you will have duplicates. So now Sphinx has your data. A very important difference between Sphinx and MySQL is that, unlike MySQL, Sphinx doesn’t know how to join different types of data. You can’t join apples with oranges, as in MySQL, and get back a row. Sphinx can mix results from different indexes, but that’s exactly what it will do: MIX. When mixing, it only sees the document id. If it gets the same id from several indexes, however, it knows to remove the duplicates. This is pretty much how distributed indexes and main+delta schemes work.

Another very, very important thing: Sphinx will return the document id, not the text; we already said it doesn’t keep the full text. Of course, besides the text fields, it can also index integer, boolean and float attributes, which will be returned. The reason is simple: those are stored as they are. Recent releases can also have string attributes which, in contrast to text fields, are returned as well. But be careful, there are some limitations and string attributes need more memory.

Why did I say that Sphinx returns the document id? Simply because doing a search request in Sphinx is not enough (in most cases). Most likely you are displaying the text field(s) or other information, so you need to make a DB call using the ids you got from Sphinx. You are not getting away from querying your database. Sphinx does not replace the database server, it complements it.
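One practical detail of that second DB call: a "WHERE id IN (...)" query does not return the rows in the order of the ids you pass, so the relevance order from Sphinx has to be restored in code. A minimal sketch, assuming the ids and rows are already fetched (orderRowsBySphinxIds() and the sample data are made up for illustration):

```php
<?php
// Hypothetical helper: put rows fetched from the database back into
// the relevance order of the document ids returned by Sphinx.
function orderRowsBySphinxIds(array $sphinxIds, array $rows)
{
    // Index the rows by their id for direct lookup.
    $byId = array();
    foreach ($rows as $row) {
        $byId[$row['id']] = $row;
    }
    $ordered = array();
    foreach ($sphinxIds as $id) {
        // Skip ids that are no longer in the database (deleted since indexing).
        if (isset($byId[$id])) {
            $ordered[] = $byId[$id];
        }
    }
    return $ordered;
}

// Ids as returned by the search, best match first (made-up values).
$sphinxIds = array(42, 7, 19);
// Rows as a "SELECT ... WHERE id IN (42, 7, 19)" might return them.
$rows = array(
    array('id' => 7, 'title' => 'Techno house'),
    array('id' => 42, 'title' => 'Techno music'),
);
$ordered = orderRowsBySphinxIds($sphinxIds, $rows);
// $ordered now holds the row with id 42 first, then the row with id 7.
```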

Now, there are some aspects of how Sphinx works in real life that need to be understood. When you have an application that uses a database, what do you actually do? You make requests to the database: to insert your data, to modify your data, to extract the data. Sphinx works the same way: you need to throw data at it. There are 2 ways to do that, and they define the index types Sphinx has:

The first way is the so-called on-disk index. When you configure it, you give it that query. After that you need to run the indexer, the tool that runs the query, gets the data and puts it in the index. You have the data. OK, but your database will get new data. Well, you need to run the indexer again to pick it up. Wait, what? Well, how could Sphinx know there’s new data? What can you do? The simple way is to run the indexer on a schedule: once per hour, per day, or per week if you don’t have frequent updates. If your data gets big or you want the info to reach the index faster, there’s a procedure called main+delta. Basically it consists of a big index, updated less frequently, and a small one, updated more often. But, at the base, the delta does the same as stated above: it runs a query to get some data. I will not go into the details of main+delta, I only want to emphasize one thing: with on-disk indexes, your updates will not appear magically in Sphinx.

But there is a second index type, the RT (realtime) index, that can do that. In short, it works almost like a MySQL table: you put data in it (using SQL queries) and the data is available within several milliseconds. What’s the catch? First, RT is considered a bit slower on big (very big) indexes than on-disk ones. Second, and also very important: unlike on-disk, where the indexer tool does the job, here you have to do it yourself. You added something new to the database? Great, you immediately need to add it to Sphinx too.

The way you retrieve the info is the same for both index types. On-disk pros: fast and less code work; cons: info is not available immediately (not without some tricks). RT pros: info is available immediately; cons: more code work and a bit slow on huge data.

The less/more work argument is very important for anyone who wants to add Sphinx to their application. Once again: on-disk is easier to integrate, but not real time.

One more thing: people think Sphinx can do some Google-search magic on their searches. Well, no. It’s not as easy as 1, 2, 3. It has relevancy algorithms implemented, and in the latest releases you can even create your own ranking rules, BUT it will NOT perform a miracle search. Google does the magic because it doesn’t just index some data and then know what you want. It records every search you make and which link you clicked from that search, and adds a counter to that page so it can be more relevant next time. Even the suggestions it makes when you misspell something are not based only on some algorithm that detects the word. That’s not so complicated, it’s called thesaurus, morphology or whatever. A wrong word gets the correct suggestion in a phrase because Google recorded how many times that meaning was the right one in association with the words from THAT phrase. Sphinx has morphology too, it has prefix/infix options, it has word forms, but it cannot guess the real meaning of something. It follows the “robotic” rules that are provided to it.

Of course, with some work you can get very relevant results. And in many, many cases you have a lot of particular cases. Rules, rules, rules. You want the results to be relevant. You also want to show something close to relevant if nothing relevant is found. This is called CUSTOM; it’s not something that you plug in. You need to analyze the data you have and HOW the users search; this is something to pay attention to. Whatever you do is useless if your users simply don’t enter the search text the way you expect. You might add Sphinx to your project in several hours, but achieving the level of relevancy you want could take several times longer. Let’s not forget that you most likely don’t want to lose too much performance, because improving relevancy can lead to slower searches. In that case, first you need to try to optimize things. If that doesn’t work … you need more power (the best is to keep the indexes in memory; an index can also be split into several chunks [each chunk is actually an index too, but Sphinx knows how to mix them] so that a search uses all the available cores).

Sphinx is fast, it’s also pretty smart and, last but not least, it can scale very well. There is also a chance that it might not fit your needs, simply because it’s not suited for them or because it’s too complicated to achieve what you need. There are alternatives: Lucene/Solr, even the Google search server, etc.

In the end, several things that are good to know:

– from the business point of view: Sphinx is free, but implementing it in a system is not. You have 3 choices: have your developer(s) learn it, find one who knows how to work with it, or contract a specialized company (SphinxTech Inc. is the company behind the project) to help you.

– also from the business point of view: Sphinx will not do any magic if your database or your code is slow. I’ve seen situations where plain bad logic was the main cause of slowness. Of course, Sphinx is (a lot) faster than MySQL for some tasks.

– from the development point of view: it’s not like plugging in a USB stick and it just works; it needs to be configured and integrated. It’s also good to use the latest version, even if you have to compile it (the compilation is one of the easiest I have ever seen): it’s continuously developed and new features, fixes and improvements keep being added.

– It’s a complementary tool to the database, it does not replace the database.

Word boosting in Sphinx

disclaimer : I’m not an expert in Sphinx

Boosting words

One feature that Sphinx is missing, but that is found in Solr/Lucene, is word boosting: you have some words which, if found, give the document a weight boost. In Solr/Lucene you can define this with the ^ operator, like word^x, where x is a number that acts as a multiplier for that word’s score.

How to do that in Sphinx? Well, it’s another play with the query string. Please note that the procedure will not actually give you the word boosting found in Solr; it’s more of a workaround.

Let’s assume you have a title and a description that are indexed in Sphinx. Let’s take an example: “techno music”. You will search for this string in both title and description (you might have a different weight set for each field; in most cases the title gets a field weight boost). Let’s say you have a query like "^techno music$"|"techno music"|"techno music"~10 to get a generally good search (the first part is an exact match on a field, the second is an exact phrase, the third is a 10-word proximity match). The reason you want this boosting is mainly that your titles might not have both words and you are interested in those that have the word techno, since music is more general. To boost techno you need to add another |"techno"/1. What you achieve is that, besides the phrase search that should match the content, you also match (if it exists) the word techno in the title. It could also match only techno in the description, but since you consider it an important term, that’s no problem: you achieved the desired search. If it’s only one word, you don’t need the /1 operator. But if you had techno house as boosted words, your extra query would be |"techno house"/1, to match at least one of them.
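Building that extra clause can be sketched in a few lines of PHP. This is only an illustration under some assumptions: addBoostClause() is a made-up helper, the boosted words are passed in as an array, and the user input is assumed to be already cleaned of Sphinx operator characters.

```php
<?php
// Hypothetical helper: append a clause for the boosted words
// that actually appear in the user's search string.
function addBoostClause($query, $searchString, array $boostedWords)
{
    $words = preg_split('/\s+/', strtolower($searchString), -1, PREG_SPLIT_NO_EMPTY);
    // Keep only the searched words that are in the boosted list.
    $found = array_values(array_intersect($words, $boostedWords));
    if (count($found) === 0) {
        return $query; // nothing to boost
    }
    // A single word needs no quorum operator; several words get "/1".
    $clause = count($found) === 1
        ? '"' . $found[0] . '"'
        : '"' . implode(' ', $found) . '"/1';
    return $query . '|' . $clause;
}

$base = '"^techno music$"|"techno music"|"techno music"~10';
$boosted = addBoostClause($base, 'techno music', array('techno', 'house'));
// $boosted is now the base query followed by |"techno"
```

With 'techno house music' as the search string, the same call would append |"techno house"/1 instead.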

The list of boosted words can be kept in a file which is read when doing a search. You can also use memory caching (memcached, APC, whatever).

UnBoosting words

What happens if you want the opposite? Let’s take the above example: techno music. Music is a general term. You could implement the above procedure, make a list of music subgenres and boost them, but why not decrease the score of the term music instead? Sphinx has the stopwords list feature, but if you add music there, it won’t be counted in the search at all. And you don’t want that, because you might want to match music too if techno is not found. The idea is that you want the docs that match techno first and then those that match music.

The implementation is similar to the one for boosting:

  • read the list of words that are not important
  • create a new string that contains the input string without the words in the list above (use str_replace to delete them)
  • add |"new_string"/1 to your search query

What we have now is existing_query | "new_string"/1, so we basically turned new_string into a boosted list of words. So if you searched for “techno house music” and music is in your list of unimportant words, the query will now be "techno house music" | "techno house"/1. The first results will contain techno, house and music, then techno OR house. If you use a more relaxed query like |"techno house music"/10, then you might get more results (with less relevance).
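The unboosting steps can be sketched like this. Note that this sketch swaps the str_replace mentioned above for array_diff on whole words, so that removing "music" cannot damage a longer word such as "musical". addUnboostClause() is a made-up name and the input is assumed to be already cleaned of operator characters.

```php
<?php
// Hypothetical helper: strip the unimportant words from the search string
// and OR the remaining words back in as a quorum ("/1") clause.
function addUnboostClause($searchString, array $unimportantWords)
{
    $words = preg_split('/\s+/', strtolower($searchString), -1, PREG_SPLIT_NO_EMPTY);
    $kept = array_values(array_diff($words, $unimportantWords));
    $query = '"' . implode(' ', $words) . '"';
    // Nothing was removed, or nothing is left: keep the query as it is.
    if (count($kept) === 0 || count($kept) === count($words)) {
        return $query;
    }
    return $query . ' | "' . implode(' ', $kept) . '"/1';
}

$query = addUnboostClause('techno house music', array('music'));
// $query is now: "techno house music" | "techno house"/1
```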

Both procedures will help your results, especially when you are searching in multiple fields with different weights. In both cases you add an OR that searches only for some words (the ones considered more important) in case they can match in smaller fields (like a title). Of course, you could do | @title term, in which case the match is made only if the term exists in that field.

Happy Sphinxing 🙂

Synonyms in Sphinx

disclaimer : I’m not an expert in Sphinx

Sometimes when doing a search you want to search not only for the words included in the query, but also for their synonyms, to increase the number of results. Sphinx doesn’t come with this by default. Instead it comes with a feature named “wordforms”. Yet wordforms is not a fully featured synonyms feature. As its name says, it takes care of the forms (variations) of a word and of misspellings, OR it redirects several uncommon words to a single one. Bear in mind: a single one. You can’t declare dog > cat and then cat > dog. So you could do dog > cat and mouse > cat, and both will be replaced by cat when searching, but you can’t make it search for all 3.

So, how to do it? Since we don’t have such a feature implemented, the only option left is to use the query string: let’s say we search for “black cat” and we have the synonym dog for cat. Our query will change from “black cat” into “black cat|dog”. Sphinx will return both “black cat” and “black dog” matches.

How to do that :

– first we create a file (let’s call it synonyms.txt) in which we put a list of synonyms on every line

– when we receive a query string, we split it into words and for every word we search this file for a match

– if a match is found, we modify the query string by replacing the word with the words found on that line

– do the search

Problems :

– obviously, the response time always grows with the length of the search query (and with filtering etc.). The new query with OR operators shouldn’t increase the response time very much (well, you might notice it if your data collection is big and you get a lot of traffic)

– searching for the synonyms. Here you could run into trouble, especially if the file grows. The simple way is to read each line, explode it and check for every word whether it is in the array. This is not very efficient and for a big file it is a problem, since you will consume a lot of memory. Alternatives might be:

  • use grep – it’s pretty fast and will return you the matched line ;
  • use memcached to store the matched line for a certain word. You can store a key like ‘synonym_cat’ with the content ‘cat|dog’. It would be best to have memcached on the same server, to avoid network lag.
  • use APC instead of memcached . You could also cache the function that does the search in the file

Here is an example of how to do it (please note that this solution is not optimal for large files):

Each line of synonyms.txt will look like this :

cat | mouse | dog

If you use another separator, be careful to replace it with | (the OR operator) when inserting it in the query string.

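// Load the synonym lists, one per line (e.g. "cat | mouse | dog").
$lines = array();
$synofile = file("synonyms.txt");
foreach ($synofile as $line) {
    $lines[] = trim($line);
}
// Normalize the input, then split it into words.
$tmp_string = strtolower(str_replace(array('-', '+'), " ", $input_string));
$words = preg_split('/\s+/', $tmp_string, -1, PREG_SPLIT_NO_EMPTY);
foreach ($words as $i => $word) {
    foreach ($lines as $line) {
        // Compare against whole words, so "cat" does not match "category".
        if (in_array($word, array_map('trim', explode('|', $line)))) {
            $words[$i] = $line; // e.g. "cat" becomes "cat | mouse | dog"
            break;
        }
    }
}
$input_string = implode(' ', $words);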

As I said, this is not a perfect solution; for example, it should check that the words have a minimum length.

Personal project : Huzmet.ro

My personal project is entering a pre-beta phase. The design is not ready yet, so for now it’s more about functionality.
Some technical details: it’s built on Zend Framework and uses SphinxSE for searching. There’s no caching yet, but it will (most likely) use memcached and file caching (using Zend Cache). There’s also a plan to use Gearman workers.
The project will start in Romania first, but multi-language support is already implemented (gettext translations).
Testing is welcome, especially testing with Romanian texts/queries, but English is fine too.
Huzmet is a directory of local service providers, and more. One feature of that “more” is that instead of searching you can make a shout, which is like a request: the shout is scanned and, if matches are found among the providers, they are informed about your shout and can contact you.
Huzmet.ro

non-input Zend Form Element

Simple case: let’s say you make a form for users to register. You have an email field, right? OK, now when the user is logged in, you have a page where they can change their password and other settings. Normally you would want to display the email too, but you don’t want it to be editable (because emails should be unique etc.). You could show the email separately from the form, but you don’t want that; you want it in the same area as the other fields.
One simple approach is to create a “dummy” element, one that can be populated by the populate() function but doesn’t actually have an <input>. The way to do it is to create an element that extends Zend_Form_Element or Zend_Form_Element_Xhtml and uses the formNote helper. Normally this helper is used by hidden elements (it’s a dead simple helper, you can check the source in Zend/View/Helper), but you can use it in this case too:

class App_Form_Element_Xhtml extends Zend_Form_Element_Xhtml
{
    // Render the value as plain text instead of an <input>.
    public $helper = 'formNote';
}

Now, you will probably extend your form when editing (I use a Base form with the common fields and extend it with new ones for Edit/Register). To get the email displayed, you first need to remove the email element (text type) and replace it with this one:

$this->removeElement('email');
$fakeEmail = new App_Form_Element_Xhtml('email', array(
    'required' => false,
    'ignore'   => true,
    'label'    => 'Email : ',
));
$this->addElement($fakeEmail);

You need to set ignore to true, otherwise $form->getValues() will return a “” value for the email. This can also be fixed by declaring getValue and setValue functions for the element, but setting ignore works too.
The last step is to re-order. If the email is the first element, you can use $fakeEmail->setOrder(-1) before adding it.

Eclipse , xdebug , remote system

This morning I had nothing better to do (actually I had, but …) so I thought “let’s try xdebug”. I’m working remotely on an EC2 instance using Eclipse PDT. Installing xdebug is pretty easy: apt-get install php5-xdebug or something like that (on Debian-based systems), then you need to edit an .ini file and create a debug profile in Eclipse (bla bla bla, there are a lot of tutorials about it).

Hit the debug button annnndd ….. “launching: waiting for xdebug session”. Wtf … doesn’t work.

Googled a bit more and found that the xdebug option xdebug.idekey should be the same as the XDEBUG_SESSION_START value used by the IDE. In this case it’s ECLIPSE_DBGP. NetBeans, for example, has netbeans-xdebug, I think. Well, you can set it to whatever.

Hit the button annndd … “launching: waiting for xdebug session”. Wtf … doesn’t work.

I remember I used xdebug some time ago, but I was running it locally. So what could it be? Well, to help you, xdebug needs a remote address. By default it is localhost. If you work remotely, you should have your IP there instead of localhost. Well, if you have a public IP, lucky you, but I bet you don’t have one.

What to do? An SSH tunnel. You can do that. OR you could use the built-in option PDT has 🙂

Zend Forms

Zend Forms are a nice tool in ZF, but they are a bit weird and look bloated to newcomers. One reason is that they follow the decorator pattern and another is that you end up writing quite some code for a damn HTML form of several lines. Yet they offer a validation system and, if you have many forms in your project, the decorators might turn out to be not such a bad thing.
After reading the tutorial, the first thing you will want to do is get rid of the “damn” dt/dd. The dt/dd combination was chosen as the default for a number of reasons (some of them good). Yet designers might look at you weirdly (even if dt/dd can be styled pretty well), or you just want div tags or a damn table there.
How to do that? Well, there are 2 (no, 3) solutions: one is to modify the decorators, the second is to create your own elements (maybe with their own decorators too) … and the last one is to use the ViewScript decorator (basically you template the form). The second, even if it looks nice, sucks because you need to re-create all the input elements. And most of all it sucks because you re-invent the wheel. Writing a decorator that does pretty much the same thing as the default ones sucks too.
ViewScript is a nice solution; the problem I see is that if you want to change your forms’ layout, you need to edit all those template files (or maybe I’m wrong). It’s a good solution, but I want to use the normal path, since those ZF people made the effort to build it.
So … after googling and googling, I have to say the docs are pretty bad. Even the tutorials. They don’t explain exactly how to “reset” the damn decorating.
Let’s take it easy and explain how decorators apply (and how forms work). It’s easy: you give an array which is processed … in the order you gave it. The first decorator will always be the ViewHelper. This one renders your <input> element (along with the attributes you set for it) and nothing else. The next decorator you declare will “embrace” the content you already have. If you have a Label, it will be concatenated with the <input> element. If you have an HtmlTag, it will wrap your input or whatever the current content is. If you have another HtmlTag, it will embrace the new content, that is, your previous HtmlTag with the input inside. And so on. It’s like having boxes of different sizes, ordered so that the first is the smallest and the last is the biggest. Pretty much the same thing happens when you add things to a form class, whether they are elements, groups or decorators. There is one thing: you can define the order here (I’ll show an example later).
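The “boxes” idea can be shown with a toy model in plain PHP. To be clear, this is NOT the Zend_Form API: wrapTag(), prependLabel() and renderChain() are made-up functions that only mimic how each decorator receives the content built so far and returns new content.

```php
<?php
// A toy model of the decorator chain, NOT the Zend_Form API:
// each decorator receives the content built so far and returns new content.
function wrapTag($tag)
{
    return function ($content) use ($tag) {
        return "<$tag>$content</$tag>";
    };
}

function prependLabel($text, $tag)
{
    return function ($content) use ($text, $tag) {
        return "<$tag><label>$text</label></$tag>" . $content;
    };
}

function renderChain($content, array $decorators)
{
    // Apply the decorators in the order they were declared.
    foreach ($decorators as $decorator) {
        $content = $decorator($content);
    }
    return $content;
}

$html = renderChain('<input type="text" name="name">', array(
    wrapTag('td'),                // smallest box: wraps the input
    prependLabel('Name :', 'td'), // label goes before the current content
    wrapTag('tr'),                // biggest box: wraps everything so far
));
// $html: <tr><td><label>Name :</label></td><td><input type="text" name="name"></td></tr>
```

Swapping the first two entries would put the label inside the same td as the input, which is the kind of surprise you get when the declaration order is wrong.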

So, let’s say we want to have a table instead of dt/dd.

$this->addElement('text', 'name', array(
    'filters'    => array('StringTrim'),
    'validators' => array(array('StringLength', true, array(3, 128))),
    'required'   => true,
    'label'      => 'Name :',
    'decorators' => array(
        'ViewHelper',
        array(array('linebr' => 'HtmlTag'), 'options' => array('tag' => 'br', 'placement' => 'append', 'openOnly' => true)),
        array('Errors'),
        array(array('data' => 'HtmlTag'), 'options' => array('tag' => 'td')),
        array('Label', 'options' => array('tag' => 'td')),
        array(array('row' => 'HtmlTag'), 'options' => array('tag' => 'tr')),
    ),
));

This will render something like this :

<tr>
    <td>
        <label>Name:</label>
    </td>
    <td>
        <input type="text" name="name" id="name" value="">
       <br/>
       <ul><li>Error 1</li></ul>
    </td>
</tr>

Let’s recap: first we have the input element, rendered by ViewHelper; then we have a br line (use openOnly so you don’t get 2 br’s); also note we use placement (if we used prepend, the br would be placed before the input); then we have Errors. Next comes a td tag which embraces everything we have so far. Now it’s Label time: instead of dt we have a td. By default it’s prepended. The last one is the tr tag that completes our row. Note that for multiple HtmlTag decorators you need to give each one a key (or name, whatever). This creates a new HtmlTag instance; otherwise the same decorator is applied. You can try removing the keys to see what happens.
This looks ugly … if you do it for all your elements. But we’re in the OO world, so we can improve it. But first, let’s say you add several more elements. At the end you will add this:

$this->setDecorators(array('FormElements',array('HtmlTag', array('tag' => 'table')),'Form'));

This will render the table tags. Remember this must be added after you have inserted all your elements, most likely after the submit button.
OK, how can you make this spaghetti nicer? You could override the default decorators for elements, but I’m using another alternative: simply create a form with no elements, add a method that returns the whole decorator array for the elements, and you can have something like this:

$this->addElement('text','name',array(
	'filters' => array('StringTrim'),
	'validators' => array(array('StringLength',true,array(3,128))),
	'required' => true,
	'label' => 'Name :',
	'decorators' => $this->decorators()
));	

Looks a bit nicer now. Even better, you can have a method:

public function loadDefaultDecorators()
{
      $this->setDecorators(array('FormElements',array('HtmlTag', array('tag' => 'table')),'Form'));
}

This will be the default decorator set for the form, so you don’t need to add it in init(). Your forms will extend this one instead of the default Zend_Form.
So now you can create forms that extend this one and you will get HTML table output.
It’s very likely that you’ll have several types of decorators; one is for sure the submit button, because you don’t have a label there.

public function submitdecorator()
{
	return array('ViewHelper',
			array(array('data'=>'HtmlTag'),'options'=>array('tag'=>'td')),
			array(array('data2'=>'HtmlTag'),'options'=>array('tag'=>'td','placement'=>"prepend")),
			array(array('row'=>'HtmlTag'),'options'=>array('tag'=>'tr')));
}

Here you go. For the submit button you use 'decorators' => $this->submitdecorator(). You can set a label too, but it will be ignored, because we didn’t define a decorator for it. Note that for the second HtmlTag I used prepend; otherwise the cell with the submit button would be on the left and we want it on the right (aligned with the inputs, not the labels, but as you wish).

OK, what’s next: using DisplayGroup. Let’s say you have your form table with several input elements, but you also have 2 checkboxes that you want displayed in the same cell.
First, you can’t use the decorator above, because it renders one element per row. Instead you want the 2 checkboxes rendered inside one cell, each checkbox inside a div for further styling.

'decorators' => array(
    'ViewHelper',
    array('Label', 'options' => array('placement' => 'append')),
    array('HtmlTag', 'options' => array('tag' => 'div')),
),

This will be your decorator for each checkbox . Note that I put the label after the checkbox .
Next is to group them :

$this->addDisplayGroup(array('firstcheckbox', 'secondcheckbox'), 'nameofgroup', array(
    'description' => 'Bla bla bla :',
    'decorators'  => array(
        'FormElements',
        array(array('data' => 'HtmlTag'), 'options' => array('tag' => 'td')),
        array('Description', 'options' => array('tag' => 'td', 'placement' => 'prepend')),
        array(array('row' => 'HtmlTag'), 'options' => array('tag' => 'tr')),
    ),
));

‘firstcheckbox’ and ‘secondcheckbox’ are the names of your checkboxes. You always need to list the element names there, otherwise they are not included in the group. I used Description to fill the left cell.

One last thing: I said something about the order of elements. Here is how it is useful. Let’s say you made a form … user data or something. Somewhere you have an admin area where, as admin, you can edit this data. Most likely you will have several additional fields (like whether the user is banned or something). You can extend the form you have. The problem is that when you add new fields, they will be added AFTER the submit button. Why? Because elements are rendered in the order they are added, unless you set an order. By default there is no order; it’s the normal iteration of the array of elements. But you can set it. The simplest way is to set a high value for the submit button, so it will be rendered last. You can do it even in your admin form:

$this->getElement('mysubmit')->setOrder(100);

This way you don’t need to touch the initial form .

Run MeeGo in VirtualBox/Linux

Please note that this is most likely outdated for newer MeeGo versions

MeeGo is a new Linux distro which targets phones, netbooks and other embedded devices. It’s a merger of Nokia’s Maemo Linux and the Intel-backed Moblin. Version 1.0 was released not long ago. It officially supports the Nokia N900 and Atom-based netbooks. But of course, if you want to take a look at it, you can run it inside a virtual machine. And from what I saw around the net, there’s already an unofficial version that can run on x86, using a normal kernel.

For a start, you should know that MeeGo is an RPM distribution, so prepare to use yum 😛

I’m using VirtualBox for virtualization, but if you prefer something else, I saw MeeGo can work on QEMU and VMware too. I used VirtualBox 3.1. The first step is to get the CD image from here. VB might not see the .img file, so in the open dialog, select “All files”.

Create a new machine. I installed it with 384MB of RAM, but the more, the better. You need to enable PAE/NX, otherwise it won’t boot. After you have created the new machine, start it. Select install MeeGo, and bla bla bla (meaning a pretty normal Linux GUI installation).

On first boot comes the moment of truth. If you are lucky, it will boot into the graphical desktop. Otherwise …

Boot again and press the Esc key. In the boot line options, remove “quiet” and boot. To get a terminal, press alt+F1. Enter your login, do a “sudo su” and then run:

init 3

This should kill the X server, which tries to start but fails (use alt+F1 to get back to the terminal until it’s killed). Now run:

yum install wget

and after wget is installed, run:

wget http://202.112.3.1/libglx.so

or

wget http://www.adriannuta.com/wp-content/uploads/2010/08/libglx.so

Now you need to copy this file to /usr/lib/xorg/modules/extensions/libglx.so. Maybe a

chmod u+s /usr/bin/Xorg

might be needed also.

Now you can reboot (or do an “init 5” and “startx”, but better reboot). Normally, you should now get the GUI. On my laptop, mutter (the GUI) is SLOW, but looks awesome. Not sure why it is slow, maybe because of the video driver … or because the kernel is optimized for Atom CPUs. Anyway, it works for a preview and you can get an idea of what this OS will become. Practically, you will be able to have (almost) anything that runs on a desktop Linux on your phone. It comes with several apps like Chrome, Empathy (a GNOME IM client), Evolution (mail client), Banshee (media player) etc.

I’m really curious how Nokia will direct the progress of MeeGo … well, actually how both Intel & Nokia will do it. I’m saying this because Nokia is the patron of KDE and Qt. This first MeeGo version is based on a GNOME/GTK interface (and even more, it has Banshee, which is based on the damn Mono (a .NET clone)).
Credits go to the page linked here, which helped me make this work.


Summer guide for linux distros

This is not a complete list of Linux distro news, just several distros that I’m watching.

As you know, Ubuntu’s third LTS (long term support) release, Lucid Lynx 10.04, has been in the wild for some time. I’m a KDE fan, so for the last 2 years I mostly used the Kubuntu spin, which sometimes seemed even more stable than the official GNOME version. The problem with Ubuntu, I think since Jaunty 9.04, is that Canonical, instead of consolidating the desktop version (making it work out of the box as much as possible), is trying to enter the mainstream and the enterprise with some services, like Ubuntu One. I’m not saying it’s a bad idea, but my feeling is that since Jaunty there have been regressions in some areas, which users complained about. Not big regressions, but annoying ones. For example, in 8.10 the 3G modem was working perfectly for me. With Jaunty the problems started. Another “perpetual problem” was with the proprietary drivers. Luckily for Ubuntu, ATI and Nvidia always provided at least a beta driver at the last moment before the launch. Of course, some of these problems were created upstream (the changes in the X server seem never-ending).

Unfortunately for Kubuntu, Canonical invests in only one core developer; the rest come from the community. This is good, because the distro has the freedom to listen a bit more to what people want. The problem is that this is not enough. The Gnome cruft is not stripped out of Kubuntu enough, and I always get the feeling that more development should be invested in Kubuntu to make it more stable and faster. All in all, I’d say Kubuntu is a pretty good KDE distro. One nice thing is that beta repos for new KDE versions become available pretty quickly for Kubuntu, and they are pretty stable. I never had the patience to wait until a new KDE version landed in the official repo, so I always used the beta repos. But as I said, the Canonical cruft from the official version is felt in Kubuntu, which often translates into less speed.

Three weeks ago I used Fedora for a week. I hadn’t tried Red Hat’s community distro for some time, even though I have worked on CentOS servers. As usual, I used the KDE spin. Like Kubuntu, it comes with an almost uncustomized KDE interface. The latest Fedora, version 13 “Goddard”, seems a pretty solid distro. Fedora has a more professional feel than Ubuntu. Performance seems pretty good, with no big problems except one: proprietary drivers 🙂 At the time it was launched, the ATI proprietary driver was not yet working on Fedora.

The big problem with proprietary drivers shows up on laptops. For example, I used the new experimental Mesa driver, which looks pretty good in performance, BUT its power management is not ready yet. No power management translates into less battery life. For example, using the open source or Mesa driver on my EliteBook (which has ATI graphics), the battery lasts about 2 hours. With the proprietary ATI driver I get 3 hours. That’s quite a difference, not to mention that the laptop is quieter, because the fan doesn’t run all the time.

Back to Fedora. One particular thing I didn’t like was SELinux. Fedora comes with a nice policy manager, but it can be annoying for a home user. Of course, it can be disabled. Among the goodies in Fedora are several system-config-* GUI utilities. Made in the Gnome style (simple, without options for expert changes), they are a nice addition that newbies might like. Another good thing is that Fedora was the only distro (I think …) where I found Eclipse at Galileo (3.5). Personally, I wouldn’t recommend Fedora to a newbie. It was and still is a distro for power users. It can work in an enterprise environment; the Red Hat influence is felt, even though it’s community-based.

openSUSE. If you want a KDE distro, this is the one. These guys work hard to make a polished distro. They have a custom KDE interface which is very, very nice 😀. I hadn’t tried openSUSE for some time. Long ago it was slower than Ubuntu or Fedora, mostly because of YaST, their utility, which is a blessing and a curse at the same time. But things have evolved ENORMOUSLY. openSUSE is a damn fast KDE distro; I was surprised. It beats Kubuntu by far (or at least that was my feeling). The current stable 11.2 uses KDE 4.3, which is a bit outdated, but you can upgrade to 4.4. I had some issues with the 3G modem (I’m not sure if that was because of the 4.4 version). 11.3 will be released soon and I’ll definitely install it; I hope the ATI and Nvidia drivers will be ready by then. openSUSE promotes the new Mesa drivers a lot, but as I said before, they can’t match the proprietary ones yet. The YaST control center is one of openSUSE’s strengths, yet I found the YaST package manager powerful but not very friendly: at first sight it looks very cryptic.

Mint is an Ubuntu fork. It’s basically a version modified by the community. In other words, most of the Canonical cruft is taken out, while the official repos are kept. Besides this stripping and the modifications to some packages, the Mint team has also developed some nice tools like the Software Manager and mintUpdate. The latest version, 9 “Isadora”, is, of course, forked from Lucid 10.04. I think it’s the first Gnome desktop I was still using after 3 days :)). It comes with a custom theme (a NORMAL layout, with 1 panel as a bottom taskbar) and runs FAST. I like it. Almost zero problems; everything worked OK. This is how Ubuntu should be. I would recommend it to newbies over Ubuntu. Even if it’s on Gnome, I can take it … well, until the KDE version is released (unlike others, the Mint team doesn’t use fixed release dates: when it’s ready, they throw it in the wild). The Software Manager is very nice: besides the apps from the official repos, it also has some external ones. For example, you can install Skype from there. Of course, if you want finer-grained management, there’s always Synaptic. One nice thing is that it comes with Pidgin and not Empathy. Personally, I don’t understand why Canonical tries to promote Empathy when there’s already a GTK IM client that even KDE users use (yeah, Kopete is a letdown, at least from my point of view).

So, in the end, things to watch:

  • openSUSE 11.3 (around 15 July)
  • Mint KDE version, and even the XFCE one
  • Fedora 13, but watch out if you need proprietary drivers
  • if you have a netbook and want to experiment, try MeeGo; it’s the next OS for Nokia phones, and not only for them