June 5, 2015

Tiguidou or How I Learned to Stop Worrying and Love the Analysis

About three years ago, a friend of mine who was doing her master's on essential oils was struggling with her analyses. This gave me the idea of automating part of the process with a small VBA program in Excel that would search a database of retention indexes for potential matches. She could then filter for the candidates with the best MS match. Curious readers might wonder why retention indexes are needed alongside MS: some information can be found in the previous post by my colleague Alexis St-Gelais on retention indexes. The aim was at least three positive identification parameters: mass spectrum, polar retention index, and non-polar retention index. It did not work. There was too much variation between the RIs in the literature, due to method discrepancies and differences between column manufacturers (see here for an example).

Fast forward a couple of months. We were offered the chance to carry on the business of two local retired professors who were doing essential oil analyses for distillers (Dr Guy J. Collin and Dr François-Xavier Garneau). The method they used for their analyses was the same one they had employed during their years as published researchers, and the one we still use at Laboratoire PhytoChemia: dual-column GC-FID analysis (as explained here). This method has some advantages (and some drawbacks, as with everything). One advantage is that you automatically get two confirmations of each molecule's identification (two retention indexes), as well as a validation of compound concentrations on two columns, which most of the time resolves coelutions. You can easily get a third confirmation by using MS when in doubt about an identification. The main challenge is that a very experienced analyst is needed to proceed efficiently. Given this, the idea of having something that would automate part of the job kept growing.

After a bit of programming, Tiguidou* 1.0 was born. This Excel VBA software allowed us to simply paste the raw integration data into a spreadsheet, and it made an efficient first sorting for us. It would automatically calculate the RIs on both columns, then, for every peak, search our database for a fitting non-polar RI and check whether it could be paired with a polar RI. It would then order the results by increasing discrepancy between theoretical pairs and observed values. The analyst would then choose the best matches from dropdown lists, or add one if needed. The program would then output a list ready to be copied into the report. This came along with a basic database management system, so we could add newly identified compounds as we found them from MS runs or the literature.

*From an expression here in Quebec meaning "Everything's alright". Backronym suggestions are welcome.
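
The first-pass sorting described above for Tiguidou 1.0 could be sketched roughly as follows. This is a minimal illustration in Python, not the original VBA code. The post does not say which retention index formula was used, so the linear (van den Dool and Kratz) index is assumed here, and the names `Compound`, `linear_ri`, `rank_candidates` and the `tol` window are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Compound:
    """One database entry; field names are hypothetical."""
    name: str
    ri_nonpolar: float
    ri_polar: float


def linear_ri(rt, alkanes):
    """Linear retention index of a peak eluting at retention time `rt`.

    `alkanes` is a list of (carbon_number, retention_time) tuples for the
    n-alkane ladder run on the same column, sorted by retention time.
    Assumption: linear interpolation between the two bracketing alkanes.
    """
    times = [t for _, t in alkanes]
    if rt < times[0] or rt > times[-1]:
        raise ValueError("retention time outside the alkane ladder")
    # Index of the alkane eluting just before (or at) the peak.
    i = min(bisect_right(times, rt) - 1, len(alkanes) - 2)
    (n, t_n), (m, t_m) = alkanes[i], alkanes[i + 1]
    return 100 * (n + (m - n) * (rt - t_n) / (t_m - t_n))


def rank_candidates(ri_np, polar_ris, database, tol=10.0):
    """Rank database compounds for one non-polar peak.

    A compound is kept if its non-polar RI is within `tol` of the peak and
    some observed polar RI is within `tol` of its polar RI. Results are
    sorted by increasing total discrepancy (assumed scoring: simple sum).
    """
    results = []
    for c in database:
        d_np = abs(c.ri_nonpolar - ri_np)
        if d_np > tol:
            continue
        # Closest observed polar peak to the compound's reference polar RI.
        best_polar = min(polar_ris, key=lambda r: abs(r - c.ri_polar))
        d_p = abs(best_polar - c.ri_polar)
        if d_p > tol:
            continue
        results.append((d_np + d_p, c.name, best_polar))
    return sorted(results)


# Placeholder values for illustration only, not real reference data.
alkanes = [(10, 5.0), (11, 7.0), (12, 9.0)]
database = [
    Compound("compound A", 1050.0, 1200.0),
    Compound("compound B", 1053.0, 1400.0),
    Compound("compound C", 1300.0, 1500.0),
]
ranked = rank_candidates(1051.0, [1198.0, 1405.0, 1600.0], database)
```

Here `ranked` lists the candidates from the smallest to the largest combined discrepancy, which is the kind of ordered list the analyst would then pick from.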

Tiguidou 1.0 Interface
Tiguidou 1.0 Database excerpt
We used this tool intensively for two years with great success, since it cut the time lost manually screening the databases by an order of magnitude. Still, we felt that it had some limitations, and we decided that a new, more powerful version was needed. After almost six months of work, I am proud to announce that we are now working with Tiguidou 2.0. Tiguidou is no longer confined to an Excel macro: it is now a web application, with a more dynamic database and a refined assignment algorithm. Version 2.0 can support more than two capillary columns, and tries to create the pairs in every possible way (not just non-polar->polar). It also checks whether the percentages observed on both columns are similar when sorting the results. The database is now built automatically from our previous reports, and allows a simple and complete follow-up of which compound was seen where. In a way, it "learns" from each of our reports. A basic report, ready for final revision, is then automatically generated in our company template. We are also working hard on adding new options to the algorithm: checking whether a compound was already observed in a given plant, developing statistics to automatically determine the expected range of each compound in an oil, detecting a possible latent chemotype, correlating the presence of one compound with another, and many more. It may even eventually help us do the integration and peak picking!
Tiguidou 2.0 Interface
Tiguidou 2.0 Database excerpt
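
To give an idea of what pairing "in every possible way" with a percentage check might look like, here is a small Python sketch. It is only an illustration under stated assumptions, not the Tiguidou 2.0 code: the names `Peak` and `score_assignment`, the column labels, and the way RI deviations and percentage differences are weighted and averaged are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Peak:
    """One observed peak on one column; field names are hypothetical."""
    ri: float        # observed retention index
    percent: float   # relative area percentage


def score_assignment(reference, observed, ri_weight=1.0, pct_weight=1.0):
    """Score one candidate assignment across any number of columns.

    `reference` maps column name -> reference RI of the compound.
    `observed` maps column name -> the Peak tentatively assigned to it.
    Every pair of columns present in both is scored; a lower score means
    a better fit. Returns None if fewer than two columns are available.
    """
    columns = sorted(set(reference) & set(observed))
    if len(columns) < 2:
        return None
    total = 0.0
    n_pairs = 0
    for a, b in combinations(columns, 2):
        # RI deviation from the reference on both columns of the pair.
        ri_dev = (abs(observed[a].ri - reference[a])
                  + abs(observed[b].ri - reference[b]))
        # The same compound should show a similar percentage on both columns.
        pct_dev = abs(observed[a].percent - observed[b].percent)
        total += ri_weight * ri_dev + pct_weight * pct_dev
        n_pairs += 1
    return total / n_pairs


# Placeholder values for illustration only.
reference = {"nonpolar": 1050.0, "polar": 1200.0, "mid": 1100.0}
observed = {"nonpolar": Peak(1051.0, 12.0), "polar": Peak(1198.0, 11.5)}
score = score_assignment(reference, observed)
```

Because the pairs come from `itertools.combinations`, adding a third column to `observed` simply adds more pairs to the average, with no column treated as the starting point.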
All this allows us to perform faster, more efficient, more complete and higher-quality analyses while keeping our prices reasonable, giving more companies access to good knowledge of their products. We plan to eventually release a public version that may allow producers and distributors to improve their quality control.
