The MolPort database contains data and prices for over 7 million compounds purchasable from stock and over 20 million made-to-order compounds. Our chemically intelligent and e-commerce-enabled online marketplace at molport.com provides flexible searching options to locate desired compounds, view their commercial data, make selections, and place orders.
Depending on your research needs, you or your colleagues may want more granular access to the MolPort database for deeper analysis of compound sets to enhance compound selection.
As examples, the purchasing department might want to assess volume discounts, shipping costs, and delivery time; computational chemists devising a compound library could want to combine substructure searching with a minimum amount available; and the screening group might want to find closely similar structural analogs to fill some gaps in a 96-well HTS plate.
This deeper analysis is best done by downloading the required data and processing it locally with in-house, open-source, or commercial visualization and analysis tools. MolPort makes its data available for free download to registered users, with a variety of options for selecting the optimum data set.
Before downloading data from the MolPort database, users should consider the available options and decide on the following:
The following database subsets are available for download, depending on your project requirements:
The All Stock Compounds subset contains information on stock products that can be delivered in two weeks or sooner. The files combine Screening Compounds and Building Blocks, marked according to type.
The Screening Compounds subset of All Stock Compounds contains screening compounds typically available in milligram quantities. Over 99% of stock screening compounds have a guaranteed purity of over 90% by H-NMR or LCMS.
The Building Blocks subset of All Stock Compounds contains data on all stock building blocks. Building blocks are generally available in larger amounts than screening compounds. They can have higher purity or better characteristics, and their lead time is generally shorter. Building blocks usually have one or more active functional groups, so they can be used to produce new compounds in specific chemical reactions.
Made-to-order compounds supplement the stock compounds. The files contain over 20 million products (both Screening Compounds and Building Blocks) with predefined prices and suppliers. However, the prices for such compounds should be considered for reference purposes only. Suppliers tend to call such products “virtual”, “tangible”, “back ordered”, “to be synthesized”, etc. The lead time for made-to-order compounds varies from 4 to 6 weeks up to 3 months. The estimated synthesis success rate is 50% to 80%.
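As a rough planning aid, the success-rate range quoted for made-to-order compounds can be turned into an order-size estimate. The sketch below is only an expected-value calculation under the assumption that each synthesis succeeds independently at the stated rate. The function name is hypothetical, and the result is not a guarantee of delivery.

```python
def orders_needed(target_compounds, success_percent):
    """Number of compounds to order so that the expected number of
    successful syntheses reaches target_compounds (rounded up).
    Integer arithmetic avoids floating-point rounding surprises."""
    return (target_compounds * 100 + success_percent - 1) // success_percent


# To expect about 40 delivered compounds at the quoted 80% and 50% rates:
optimistic = orders_needed(40, 80)
pessimistic = orders_needed(40, 50)
print(f"Order between {optimistic} and {pessimistic} compounds")
```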
This set combines Stock and made-to-order compounds. The files also contain all historical information on products which were sold out, have no price defined, or do not currently have specific suppliers assigned. In total, this set contains over 40 million chemical structures. Data is provided in SMILES format only.
Additional properties of the compounds in the MolPort database are supplied in properties files, which are compressed, tab-separated text files with MolPort IDs and additional molecule properties as listed in the table below, available for FTP/HTTP download and API access. Additional data includes information such as stock amounts, lead times and price ranges. You can use these files to append information contained in Open Access Files or SMILES files. Note that not all data fields are available in some of the downloadable files.
| Data field | Open access FTP files (SDF/SMILES) | FTP SDF (2D) files | FTP SMILES files | FTP Properties file | FTP Monthly update files (SDF/SMILES) | HTTP download (SDF/SMILES) | Chemical Search API (also via Excel, KNIME, Pipeline Pilot) | Full Molecule Load API |
|---|---|---|---|---|---|---|---|---|
| Data update frequency | Monthly | Monthly | Monthly | Monthly | Monthly | Monthly | Daily | Daily |
| Maximum stock amount verified with supplier | N | Y | N | Y | Y | Y/N | Y | N |
| Maximum unverified amount (as package size) | N | Y | N | Y | Y | Y/N | Y | N |
| Is available as Screening Compound? | N | Y | N | Y | Y | Y/N | N | N |
| Is Building Block? | N | Y | N | Y | Y | Y/N | N | N |
| Best lead time | N | Y | Y | Y | Y | Y | N | N |
| Price range for 1 mg | N | Y | Y | Y | Y | Y | N | N |
| Price range for 5 mg | N | Y | Y | Y | Y | Y | N | N |
| Price range for 50 mg | N | Y | Y | Y | Y | Y | N | N |
| Direct link to molecular page | Y | Y | N | Y | Y | Y/N | Y | Y |
| Download size (compact archive) | 2 GB/100 MB | ~2 GB | ~500 MB | ~500 MB | ~30-50 MB | 2 GB/500 MB | 0 | 0 |
| Full file size | 20 GB/1 GB | ~20 GB | ~2 GB | ~2 GB | 300-500 MB | 20 GB/2 GB | 0 | 0 |
| Number of files | 2 (1 per type) | 15 | 15 | 1 | 3* | 2 (1 per type) | 0 | 0 |
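Appending properties to structure records amounts to a join on the MolPort ID. The following is a minimal sketch of that join using only the Python standard library. The column names (`MOLPORT_ID`, `BEST_LEAD_TIME`), the sample rows, and the use of gzip compression are assumptions for illustration, so check the header line and archive format of the files you actually download.

```python
import csv
import gzip
import io


def read_tsv(handle):
    """Parse a tab-separated text stream whose first line holds column headers."""
    return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


def read_gzipped_tsv(path):
    """Read a gzip-compressed TSV file (the compression format is an assumption)."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return read_tsv(handle)


def append_properties(structures, properties, id_column="MOLPORT_ID"):
    """Left join: keep every structure row and add matching property columns."""
    by_id = {row[id_column]: row for row in properties}
    return [{**row, **by_id.get(row[id_column], {})} for row in structures]


# Made-up sample data standing in for a SMILES file and a properties file.
SMILES_TEXT = (
    "SMILES\tMOLPORT_ID\n"
    "CCO\tMolPort-000-000-001\n"
    "c1ccccc1\tMolPort-000-000-002\n"
)
PROPS_TEXT = "MOLPORT_ID\tBEST_LEAD_TIME\nMolPort-000-000-001\t5\n"

merged = append_properties(
    read_tsv(io.StringIO(SMILES_TEXT)),
    read_tsv(io.StringIO(PROPS_TEXT)),
)
for row in merged:
    print(row)
```

Structure rows without a matching properties row are kept unchanged, which makes it easy to spot compounds for which no additional data was supplied.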
Downloaded chemical structures are provided as SDfiles or SMILES strings.
An SDfile or structure-data file is a text file with a predefined format to store chemical data. It is the most popular standard format to store 2D structures of molecules, listing each atom with coordinates and a connectivity table for atom/atom bonds. Each molecule has its properties stored after the molecule block. Most chemical software can use SDfiles directly or convert them for internal representation using a built-in import process.
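To illustrate the layout described above, here is a minimal sketch of an SDfile reader in plain Python. It relies only on the general format conventions (a molecule block ending in `M  END`, data items introduced by `> <NAME>` lines, and records terminated by `$$$$`). The field name `MOLPORT_ID` in the sample record is an assumption, and a cheminformatics toolkit is the better choice for real work.

```python
import re


def parse_sdf(text):
    """Return a list of (molblock, properties) pairs from SDfile text."""
    records = []
    mol_lines, props, current, in_data = [], {}, None, False
    for line in text.splitlines():
        if line.strip() == "$$$$":  # record terminator
            joined = {name: "\n".join(vals) for name, vals in props.items()}
            records.append(("\n".join(mol_lines), joined))
            mol_lines, props, current, in_data = [], {}, None, False
        elif not in_data:  # still inside the molecule block
            mol_lines.append(line)
            if line.startswith("M  END"):
                in_data = True
        elif line.startswith(">"):  # data header such as "> <NAME>"
            match = re.search(r"<([^>]+)>", line)
            current = match.group(1) if match else None
            if current is not None:
                props[current] = []
        elif not line.strip():  # a blank line ends the current data item
            current = None
        elif current is not None:
            props[current].append(line)
    return records


# Made-up single-atom record used only to exercise the parser.
SAMPLE_SDF = "\n".join([
    "",
    "  sketch",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <MOLPORT_ID>",
    "MolPort-000-000-001",
    "",
    "$$$$",
    "",
])

records = parse_sdf(SAMPLE_SDF)
print(records[0][1])
```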
SMILES is an acronym for 'Simplified Molecular Input Line Entry System'. This format includes atom types and connectivity, but the 2D structure must be generated by software on the fly if it is needed. SMILES is much more compact and better suited for storage in standard databases, text fields, or spreadsheets, since each structure is a single compact text string. Most modern chemical software can use this format as well. The most recent formal version of SMILES is OpenSMILES.
A single structure can be coded with SMILES in multiple ways, depending on the starting atom. In order to match identical molecules using SMILES, a set of algorithmic rules is applied to standardize the representation and obtain the so-called Canonical SMILES. Note that obtaining canonical SMILES for some structures may fail, and that the canonicalization rules can differ between software packages. Our SMILES files are tab-separated text column files. They contain structures as SMILES (with ChemAxon SMILES extensions, when needed), Canonical SMILES (created using the Chemistry Development Kit), MolPort IDs, and other properties. The first line contains column headers. If your chemical data software can’t import or open SMILES in text files, you may need to change the file extension to “.smi” or “.smiles”. Files can also be opened with a basic text editor or in Microsoft Excel for review.
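Because the SMILES files are tab-separated with a header line, they can be read with a generic CSV reader and indexed by the canonical SMILES column for exact-structure lookups. The sketch below assumes hypothetical column names (`SMILES`, `CANONICAL_SMILES`, `MOLPORT_ID`) and made-up rows. Since canonicalization rules differ between packages, a lookup only works if the query string was canonicalized with the same software as the file.

```python
import csv
import io


def index_by_canonical(handle, canonical_column="CANONICAL_SMILES",
                       id_column="MOLPORT_ID"):
    """Map each canonical SMILES string to the MolPort IDs that carry it."""
    index = {}
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        index.setdefault(row[canonical_column], []).append(row[id_column])
    return index


# Made-up rows: the first column is the structure as supplied,
# the second is the canonical form.
SAMPLE = (
    "SMILES\tCANONICAL_SMILES\tMOLPORT_ID\n"
    "OCC\tCCO\tMolPort-000-000-001\n"
    "C1=CC=CC=C1\tc1ccccc1\tMolPort-000-000-002\n"
)

index = index_by_canonical(io.StringIO(SAMPLE))
print(index.get("CCO"))  # found by the canonical form
print(index.get("OCC"))  # a non-canonical spelling of the same molecule is not found
```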
While the online database is updated daily, the available download files are updated monthly. After an initial download, a Monthly Update File lists added compounds (SDfile/SMILES, MolPort ID, and stock amount) and removed compounds.
More up-to-date data can be obtained by using the Chemical Search API (see below) to query the online MolPort database.
You can download MolPort database files via FTP, HTTP, special tools, and Web Services API, and MolPort data is also available via several third-party platforms.
There are two file sets available: Open Access Files, and Standard, plus a monthly update.
These files allow downloading information on all stock compounds available on MolPort.com from an FTP server with standard credentials (does not require a registered account).
Two files, named MolPortAllStockCompounds (SMILES and SDfile), contain key information on all stock compounds (structures, MolPort IDs and a direct link to the dedicated molecule page on molport.com where additional information about suppliers, prices, purity, delivery time, shipping costs, etc. is available). These files are updated monthly.
An extended version of the files described above is available for users who want access to advanced data. These files can be accessed at no charge after acquiring a login and password. Please fill out this form to receive your credentials; you will then receive instructions for downloading the files. Note: Only users with corporate or university email addresses can receive valid credentials.
Files on our FTP server are stored in folders that are created monthly and named for the month, for example “2018-04”. Each subset (see Data subsets for download, above) is provided as a list of compressed files, each containing 500,000 compounds in SDfile or SMILES format with the associated data. Older folders contain the information needed to update data to the current version (see below).
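The monthly folder naming and the fixed file size make it straightforward to work out which folders and how many files to expect. The sketch below uses hypothetical helper names and assumes the “YYYY-MM” pattern shown in the example holds for every month.

```python
from datetime import date

COMPOUNDS_PER_FILE = 500_000  # per-file size stated for the standard file sets


def monthly_folders(last_synced, current):
    """Folder names for every month after last_synced, up to and including current."""
    year, month = last_synced.year, last_synced.month
    names = []
    while (year, month) < (current.year, current.month):
        month += 1
        if month > 12:
            year, month = year + 1, 1
        names.append(f"{year:04d}-{month:02d}")
    return names


def files_expected(compound_count):
    """Number of files needed to hold compound_count compounds (rounded up)."""
    return (compound_count + COMPOUNDS_PER_FILE - 1) // COMPOUNDS_PER_FILE


print(monthly_folders(date(2017, 11, 20), date(2018, 1, 5)))
print(files_expected(7_200_000))
```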
Each month we create a special folder – “Changed Since Previous Update”, which contains data on added compounds (SDfile and SMILES files) and removed compounds (plain text files with MolPort IDs). The folder “Amount Data” contains tab-separated text files with MolPort IDs and stock amounts. This allows you to update the previously acquired information.
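Applying a monthly update to a local copy comes down to three steps: drop the removed MolPort IDs, add the new compounds, and refresh the stock amounts. The following sketch assumes the update files have already been parsed into Python objects. The record keys (`id`, `smiles`, `amount`) and sample values are hypothetical.

```python
def apply_monthly_update(local, added, removed_ids, amounts):
    """Return an updated copy of `local` (a dict of MolPort ID -> record).

    added:       list of records for new compounds
    removed_ids: set of MolPort IDs to drop
    amounts:     dict of MolPort ID -> current stock amount
    """
    updated = {
        mol_id: dict(record)
        for mol_id, record in local.items()
        if mol_id not in removed_ids
    }
    for record in added:
        updated[record["id"]] = dict(record)
    for mol_id, amount in amounts.items():
        if mol_id in updated:  # ignore amounts for compounds we do not hold
            updated[mol_id]["amount"] = amount
    return updated


# Made-up local data set and update contents.
local = {
    "MolPort-000-000-001": {"id": "MolPort-000-000-001", "smiles": "CCO", "amount": 10},
    "MolPort-000-000-002": {"id": "MolPort-000-000-002", "smiles": "CCN", "amount": 5},
}
added = [{"id": "MolPort-000-000-003", "smiles": "CCC", "amount": 1}]
removed = {"MolPort-000-000-002"}
amounts = {"MolPort-000-000-001": 25, "MolPort-000-000-003": 2}

current = apply_monthly_update(local, added, removed, amounts)
print(sorted(current))
```

If several months have been skipped, apply the updates in chronological order, oldest folder first, so that later changes override earlier ones.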
If FTP access is not the ideal option, you can also download the data via HTTP. Files obtained with this protocol duplicate the FTP downloads, with one difference: each data subset is provided as a single file. Contact us to receive your credentials for this option.
The download process can be automated using scripts, automatic downloading, or workflow automation tools. As an example, this blog post shows how to automate the download process for MolPort files using KNIME.
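A scripted download can be as simple as the sketch below, which uses Python's standard `ftplib` to fetch every file in one remote folder that is not already present locally. The host name, credentials, and folder path in the commented usage are placeholders, not MolPort's actual values. Use the details you receive with your credentials.

```python
from ftplib import FTP
from pathlib import Path


def download_folder(ftp, remote_dir, local_dir):
    """Download files from remote_dir that are missing in local_dir.

    `ftp` is a connected, logged-in ftplib.FTP object. Returns the names
    of the files fetched. Assumes remote_dir contains only plain files.
    """
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    ftp.cwd(remote_dir)
    fetched = []
    for name in ftp.nlst():
        target = local_dir / name
        if target.exists():  # skip files downloaded on an earlier run
            continue
        with open(target, "wb") as out:
            ftp.retrbinary(f"RETR {name}", out.write)
        fetched.append(name)
    return fetched


# Usage (placeholders only, left commented out so the sketch runs offline):
# with FTP("ftp.example.org") as ftp:
#     ftp.login("your-login", "your-password")
#     download_folder(ftp, "/2018-04/some-subset", "molport_data")
```

Skipping files that already exist makes the script safe to rerun after an interrupted transfer, although a partially written file would need to be deleted first.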
The Web Services API provides programmatic access to the MolPort database for up-to-date information.
Data collected from multiple suppliers is not suitable for storage in SDfile or SMILES formats. Instead, you can use the Full Molecule Load API to download complete supplier data for a specific molecule in JSON format. The data will include all supplier details shown on the dedicated compound pages at molport.com.
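The reason multi-supplier data fits JSON better than SDfile or SMILES is that it is nested: one molecule has many suppliers, and each supplier has many offers. The sketch below flattens such a structure into one row per offer for analysis. The payload shape and every field name in it are hypothetical and do not describe the actual response of the Full Molecule Load API, so consult the API documentation for the real schema.

```python
import json

# Hypothetical payload, invented purely to illustrate nested supplier data.
SAMPLE_RESPONSE = """
{
  "molport_id": "MolPort-000-000-001",
  "suppliers": [
    {"name": "Supplier A",
     "offers": [{"amount_mg": 1, "price_usd": 40.0},
                {"amount_mg": 5, "price_usd": 90.0}]},
    {"name": "Supplier B",
     "offers": [{"amount_mg": 5, "price_usd": 75.0}]}
  ]
}
"""


def flatten_offers(payload):
    """Turn the nested molecule -> supplier -> offer data into flat rows."""
    rows = []
    for supplier in payload.get("suppliers", []):
        for offer in supplier.get("offers", []):
            rows.append({
                "molport_id": payload["molport_id"],
                "supplier": supplier["name"],
                **offer,
            })
    return rows


rows = flatten_offers(json.loads(SAMPLE_RESPONSE))
five_mg = [row for row in rows if row["amount_mg"] == 5]
best = min(five_mg, key=lambda row: row["price_usd"])
print(best["supplier"], best["price_usd"])
```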
Please note that the prices you get via Web Services do not include any volume discounts that may apply when ordering a larger number of compounds. If you require a formal quotation, we suggest using the MolPort List Search capability, as it calculates the best way to procure your compound set and lets you generate a formal quote online as a spreadsheet or in PDF format.
MolPort works collaboratively with third parties to ensure that its data is available through their various software applications and databases. MolPort data is currently available via the following third-party platforms:
If you would like to access MolPort data via another application or platform, please contact us and we will try to make it possible.