Friday, 1 November 2013

simplified molecular-input line-entry system (SMILES)

Ooops, it's not this kind of smile, heeee

Assalamualaikum :-]

After PDB and HTML, we will now introduce you to the simplified molecular-input line-entry system, or simply SMILES.

SMILES is a specification in the form of a line notation for describing the structure of chemical molecules using short ASCII strings.

SMILES notations are composed of atoms (designated by atomic symbols), bonds, parentheses (used to show branching), and numbers (used to designate ring opening and closing positions).

1) atoms:

      Atoms are represented by their atomic symbols, such as C for carbon, N for nitrogen, Cl for chlorine, and so on. For example:
Compound      Structural Formula    SMILES Notation
---------     ------------------    ---------------
Ethylene      CH2=CH2               C=C
Propylene     CH2=CH-CH3            C=CC
2-Butene      CH3-CH=CH-CH3         CC=CC
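
To make the atom rule a bit more concrete, here is a minimal Python sketch that pulls the atomic symbols out of a simple SMILES string. The function name atom_symbols and the short list of supported elements are our own assumptions for illustration. It only handles upper-case symbols written without brackets, and the CCCl example (1-chloropropane) is not from the table above.

```python
import re

# Two-letter symbols are tried before one-letter ones, otherwise
# "Cl" would be read as carbon followed by a stray "l".
# Assumption: only this small set of unbracketed elements is supported.
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOPSFI]")


def atom_symbols(smiles):
    """Return the atomic symbols found in a simple SMILES string."""
    return ATOM_PATTERN.findall(smiles)


print(atom_symbols("C=C"))    # ['C', 'C']
print(atom_symbols("CC=CC"))  # ['C', 'C', 'C', 'C']
print(atom_symbols("CCCl"))   # ['C', 'C', 'Cl']
```

Note that hydrogen never shows up in the result: in these notations the hydrogens are implied, so only the heavier atoms are written.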

2) bonds:
     The four basic bond types are single, double, triple and aromatic. A single bond can be written with the hyphen symbol "-", but it is usually omitted. A double bond is designated by the symbol "=" and a triple bond by the symbol "#".

Compound      Bond               SMILES Notation
---------     ---------------    ---------------
Ethylene      double             C=C
Propylene     single & double    C=CC
Acetylene     triple             C#C
Propyne       single & triple    C#CC
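
The bond symbols can be read mechanically. The sketch below is only an illustration, assuming a straight chain with no branches, no rings and one-letter atoms. The names BOND_ORDER and chain_bonds are made up for this example. It walks along the string and records the order of the bond between each pair of neighbouring atoms, treating a missing symbol as a single bond.

```python
# Bond symbol -> bond order (1 = single, 2 = double, 3 = triple)
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def chain_bonds(smiles):
    """List (atom, atom, order) for an unbranched, ring-free SMILES.

    Assumption: every atom is a single letter such as C, N or O.
    """
    bonds = []
    previous = None  # symbol of the last atom read
    order = 1        # a bond that is not written is a single bond
    for ch in smiles:
        if ch in BOND_ORDER:
            order = BOND_ORDER[ch]
        else:
            if previous is not None:
                bonds.append((previous, ch, order))
            previous = ch
            order = 1  # reset to the default for the next bond
    return bonds


print(chain_bonds("C=CC"))  # [('C', 'C', 2), ('C', 'C', 1)]
print(chain_bonds("C#C"))   # [('C', 'C', 3)]
```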

Aromatic rings need no separate bond symbol. They are shown simply by writing the atomic symbols in lower case. For instance, the SMILES notation for benzene is c1ccccc1.
The use of the numbers as ring opening and closing positions is discussed in section 4.
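
Since aromaticity is carried by the letter case, a program can spot aromatic atoms just by looking at the symbols. Here is a small sketch under our own assumptions: the names TOKEN and aromatic_atoms are hypothetical, only a handful of elements are recognised, and the toluene and chlorobenzene strings are extra examples that are not part of this post.

```python
import re

# Upper-case (aliphatic) symbols first, then lower-case (aromatic) ones.
# "Cl" and "Br" come first so that their second letter is never
# mistaken for an aromatic atom.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def aromatic_atoms(smiles):
    """Return the aromatic (lower-case) atom symbols in a SMILES string."""
    return [symbol for symbol in TOKEN.findall(smiles) if symbol.islower()]


print(len(aromatic_atoms("c1ccccc1")))    # 6, benzene
print(len(aromatic_atoms("Cc1ccccc1")))   # 6, toluene: the methyl C is not aromatic
print(len(aromatic_atoms("Clc1ccccc1")))  # 6, chlorobenzene
```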

3) branches:
      In SMILES notation, a branch is shown by enclosing it in parentheses. For example, isobutane (2-methylpropane) is written CC(C)C.
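
One common way to read parentheses is with a stack: when a branch opens, remember the atom it hangs from, and when it closes, go back to that atom. The sketch below is a simplified illustration, not a full SMILES parser. The function name branch_bonds is our own, atoms are assumed to be single letters, bond symbols are ignored, and isobutane (CC(C)C) is our chosen test molecule.

```python
def branch_bonds(smiles):
    """Return bonds as pairs of atom indices, following branches.

    Assumption: one-letter atoms only; bond symbols and ring
    numbers are skipped.
    """
    bonds = []
    stack = []       # atoms to return to when a branch closes
    previous = None  # index of the atom the next atom bonds to
    count = 0        # number of atoms read so far
    for ch in smiles:
        if ch == "(":
            stack.append(previous)
        elif ch == ")":
            previous = stack.pop()
        elif ch.isalpha():
            if previous is not None:
                bonds.append((previous, count))
            previous = count
            count += 1
    return bonds


# Isobutane: atom 1 is the central carbon, bonded to atoms 0, 2 and 3.
print(branch_bonds("CC(C)C"))  # [(0, 1), (1, 2), (1, 3)]
```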

4) cyclic structure:
      Here the numbers 1 through 9 are used to indicate the starting and terminating atoms of a ring. Some rules need to be followed:
    a. The SAME number (1, 2, 3, etc.) is used to indicate the starting and terminating atom for each ring. The starting and terminating atom must be connected to each other!

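    b. Each number that is used (1, 2, 3, etc.) MUST appear in a pair: once on the atom that opens the ring and once on the atom that closes it. Standard SMILES allows a number to be reused after its ring has been closed, but using each number twice and ONLY twice keeps the notation easy to read.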
    c. A starting or terminating atom can be associated with two consecutive numbers. For example, naphthalene can be coded as c12ccccc1cccc2 (see the example below). The "12" following the first carbon is read as two separate numbers, 1 and 2: the first carbon is connected both to the carbon that closes ring 1 and to the carbon that closes ring 2.
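
The ring rules can be checked with a few lines of Python. This is a minimal sketch under our own assumptions: the function name ring_bonds is hypothetical, atoms are single letters, and every digit is treated as one ring number. It pairs each opening number with its closing number and reports which two atoms each ring bond connects.

```python
def ring_bonds(smiles):
    """Return the ring-closing bonds as pairs of atom indices.

    Assumption: one-letter atoms and single-digit ring numbers only.
    """
    open_rings = {}  # ring number -> index of the atom that opened it
    closures = []
    atom = -1        # index of the most recent atom
    for ch in smiles:
        if ch.isalpha():
            atom += 1
        elif ch.isdigit():
            if ch in open_rings:
                # Second appearance: close the ring back to its start.
                closures.append((open_rings.pop(ch), atom))
            else:
                # First appearance: remember where the ring starts.
                open_rings[ch] = atom
    if open_rings:
        numbers = ", ".join(sorted(open_rings))
        raise ValueError("ring number(s) never closed: " + numbers)
    return closures


print(ring_bonds("c1ccccc1"))        # [(0, 5)]
print(ring_bonds("c12ccccc1cccc2"))  # [(0, 5), (0, 9)]
```

For naphthalene, the first carbon (index 0) appears in both pairs, which is exactly what the "12" after it means. A number that is opened but never closed, as in c1ccccc, raises an error.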


The table below summarizes the SMILES notation:

Aspect                                     SMILES Notation
------                                     ---------------
Atoms                                      Atomic symbols
Bonds: single                              "-"
Bonds: double                              "="
Bonds: triple                              "#"
Branches                                   Parentheses, ( )
Cyclic structure (opening and closing)     Numbers 1 to 9


Try it and keep smiling C=

