
Structured Data Format

Structural data files are the files software agents typically use when processing chemical structural information, but they can also contain additional information such as molecular spectra. In principle, any structural data file has two major components: a simplified connection table and additional information. The InChI line notation loosely mirrors this split, in that its main layer plays the role of the simplified connection table and the other layers carry the additional information, except that in a structural data file hydrogens can be implicit or explicit (in the InChI they are explicit). So when you look at the different types of structural data files, you will see that they all have an atom table and a bond table. Information about individual atoms, such as isotopic definitions, is associated with the atom table. The atom table may also give the 3D coordinates associated with a specific environment; if that information is missing, software agents will use an energy minimization calculation to determine the 3D structure of the isolated molecule.
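To make the atom table, the bond table, and the implicit/explicit hydrogen distinction concrete, here is a minimal sketch in Python. The names `Atom`, `Bond`, `DEFAULT_VALENCE`, and `implicit_hydrogens` are hypothetical and not part of any file format or library, and the valence table is a deliberate simplification that only holds for neutral atoms in their usual valence state.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    symbol: str
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


@dataclass
class Bond:
    begin: int  # 1-based index into the atom table
    end: int    # 1-based index into the atom table
    order: int  # 1 = single, 2 = double, 3 = triple


# Assumption: simplified default valences for neutral atoms only.
DEFAULT_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


def implicit_hydrogens(atoms, bonds):
    """Return, per atom, how many hydrogens are implied but not listed."""
    used = [0] * len(atoms)
    for bond in bonds:
        used[bond.begin - 1] += bond.order
        used[bond.end - 1] += bond.order
    return [max(DEFAULT_VALENCE[a.symbol] - u, 0) for a, u in zip(atoms, used)]


# Ethanol written with heavy atoms only (O-C-C): hydrogens are implicit.
atoms = [Atom("O"), Atom("C"), Atom("C")]
bonds = [Bond(1, 2, 1), Bond(2, 3, 1)]
print(implicit_hydrogens(atoms, bonds))  # [1, 2, 3]
```

If the six hydrogens were listed as atoms of their own, each with a single bond to its neighbour, the same function would return zero for every atom, which is the fully explicit case.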

Molfiles are text files that contain structure information for a single compound. An SDF (structure-data file) consists of a series of molfiles joined together, along with some additional information about each compound. SDFs are frequently used for sharing libraries of compound structure data. The following single-record example describes ethanol (C2H6O):

702
  -OEChem-02271511112D

  9  8  0     0  0  0  0  0  0999 V2000
    0.5369    0.9749    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.4030    0.4749    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.2690    0.9749    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8015    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.0044    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.9590    1.5118    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.8059    1.2849    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.5790    0.4380    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.6649    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  1  9  1  0  0  0  0
  2  3  1  0  0  0  0
  2  4  1  0  0  0  0
  2  5  1  0  0  0  0
  3  6  1  0  0  0  0
  3  7  1  0  0  0  0
  3  8  1  0  0  0  0
M  END
> <ID>
00001

> <DESCRIPTION>
Solvent produced by yeast-based fermentation of sugars.

$$$$
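As a rough illustration of how such a record can be read, the sketch below splits SDF text into records at each `$$$$` line, then uses the counts line (9 atoms, 8 bonds) to locate the atom table, the bond table, and the `> <NAME>` data items. The function names `split_records` and `parse_record` are hypothetical, and splitting on whitespace is a simplifying assumption: the V2000 layout is fixed-width, so a strict reader would slice by column instead.

```python
SDF_TEXT = """702
  -OEChem-02271511112D

  9  8  0     0  0  0  0  0  0999 V2000
    0.5369    0.9749    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.4030    0.4749    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.2690    0.9749    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8015    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.0044    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.9590    1.5118    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.8059    1.2849    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.5790    0.4380    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.6649    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  1  9  1  0  0  0  0
  2  3  1  0  0  0  0
  2  4  1  0  0  0  0
  2  5  1  0  0  0  0
  3  6  1  0  0  0  0
  3  7  1  0  0  0  0
  3  8  1  0  0  0  0
M  END
> <ID>
00001

> <DESCRIPTION>
Solvent produced by yeast-based fermentation of sugars.

$$$$
"""


def split_records(text):
    """Split SDF text into lists of lines; each record ends with '$$$$'."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    return records


def parse_record(lines):
    """Return (atoms, bonds, data) for one record (whitespace-split sketch)."""
    # Lines 0-2 are the header block; line 3 is the counts line.
    counts = lines[3].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        begin, end, order = (int(v) for v in line.split()[:3])
        bonds.append((begin, end, order))
    # Data items: a '> <NAME>' line followed by value lines and a blank line.
    data, key = {}, None
    for line in lines[4 + n_atoms + n_bonds:]:
        if line.startswith(">"):
            key = line[line.index("<") + 1:line.index(">", 1)]
            data[key] = []
        elif key is not None and line.strip():
            data[key].append(line)
    return atoms, bonds, {k: "\n".join(v) for k, v in data.items()}


records = split_records(SDF_TEXT)
atoms, bonds, data = parse_record(records[0])
print(len(atoms), len(bonds), data["ID"])  # 9 8 00001
```

For real work, an established cheminformatics toolkit is a safer choice than a hand-written reader, since it handles fixed-width fields, property lines before `M  END`, and malformed records.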

A compound record contains several distinct sections. First, there is a three-line header block. These three lines may contain:

  1. The name of the molecule
  2. Details of the software used to generate the compound structure
  3. A comment

Alternatively, any (or all) of these lines may be left blank.

In the example above, the molecule’s name is “702”, the second line (“-OEChem-02271511112D”) identifies the software that generated the structure, and the comment line is blank.
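The header block can be picked apart with a few lines of Python. This is a sketch under assumptions: `parse_header` is a hypothetical helper, and the column layout used for the second line (two characters of user initials, an eight-character program name, a ten-digit MMDDYYHHmm timestamp, then a two-character dimension code) follows the V2000 convention as commonly documented. It only works if the line keeps its leading spaces, and files that leave the line blank or free-form will simply yield empty fields.

```python
from datetime import datetime


def parse_header(lines):
    """Parse the three header lines of a molfile record into a dict."""
    name, program_line, comment = lines[0], lines[1], lines[2]
    padded = program_line.ljust(22)  # pad so the slices below never run short
    info = {
        "name": name.strip(),
        "initials": padded[0:2].strip(),
        "program": padded[2:10].strip(),
        "dimension": padded[20:22].strip(),
        "comment": comment.strip(),
    }
    try:
        # Assumed layout: MMDDYYHHmm, e.g. 0227151111
        info["timestamp"] = datetime.strptime(padded[10:20], "%m%d%y%H%M")
    except ValueError:
        info["timestamp"] = None  # blank or non-standard program line
    return info


header = ["702", "  -OEChem-02271511112D", ""]
info = parse_header(header)
print(info["name"], info["program"], info["dimension"])  # 702 -OEChem- 2D
print(info["timestamp"])  # 2015-02-27 11:11:00
```

Read this way, the second line of the example says the record was written by OEChem on 27 February 2015 at 11:11 and carries 2D coordinates, which matches the all-zero z column in the atom table.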


The content of this website is licensed under CC BY-NC-ND 4.0.

©2022-2025 xtec.dev