Commercially Available Compound Database Access


The MolPort database contains data and prices for over 7 million compounds purchasable from stock and over 20 million made-to-order compounds. Our chemically intelligent and e-commerce-enabled online marketplace at molport.com provides flexible searching options to locate desired compounds, view their commercial data, make selections, and place orders.

Depending on your research needs, you or your colleagues may want more granular access to the MolPort database for deeper analysis of compound sets to enhance compound selection, such as:

  • Filtering
  • Diversity analysis
  • Clustering
  • Virtual screening
  • Virtual docking

As examples, the purchasing department might want to assess volume discounts, shipping costs, and delivery time; computational chemists devising a compound library could want to combine substructure searching with a minimum amount available; and the screening group might want to find closely similar structural analogs to fill some gaps in a 96-well HTS plate.

This deeper analysis is best done by downloading the required data and processing it locally with in-house, open-source, or commercial visualization and analysis tools. MolPort makes its data available for free download to registered users, with a variety of options for selecting the optimum data set.

Summary

Before downloading data from the MolPort database, users should consider the available options and decide on the following:

  • Which compounds? (see details)

    Various subsets are available.
    • All stock compounds (screening compounds and building blocks)
    • All stock screening compounds
    • All stock building blocks
    • Made-to-order compounds
    • Full database (all the above plus sold-out compounds, and compounds with no price or specific supplier)
  • Which data? (see details)

    • In addition to structure, MolPort ID, and a link to the record in the online database, what other data fields (such as pack size, lead time, price range, InChI/InChI Key, IUPAC name, etc.) are required? Please note that not all data fields are available in every format.
  • Which format? (see details)

    • Structures are available as SDfiles and SMILES strings. We also include a canonical SMILES generated using the Chemistry Development Kit.
    • Other data is embedded in the SDfiles or held in tab-delimited text files containing the SMILES strings.
  • Update frequency? (see details)

    • While the online database is updated daily, the available download files are updated monthly. After an initial download, a Monthly Update File lists added compounds (SDfile/SMILES, MolPort ID, and stock amount) and removed compounds.
    • More up-to-date data can be obtained by using the Chemical Search API (see below) to query the online MolPort database.
  • Access method? (see details)

    • Access to the MolPort database is available via FTP, HTTP, downloads with special tools such as KNIME, Web Services API, and via other third-party platforms.
    • There are two FTP download options:
      • Open Access Files are available at no charge, and with no need to register, and contain all stock compounds with structure (SDfile/SMILES), MolPort ID, and a link to the compound page in the online database. These files are updated monthly.
      • Standard FTP Download is available at no charge to registered users (limited to corporate and university email addresses), and provides all the data in the Open Access Files plus data elements such as stock amount, pack size, lead time and price range. These files are updated monthly with a Monthly Update File.
    • HTTP Access provides the same data as the Standard FTP Download, but each database subset only has one file to download.
    • Download with special tools: the download process can be automated using scripts, automatic downloading, or workflow automation tools. This example shows how to automate the download process using KNIME.
    • Web Services API provides direct programmatic access to the online MolPort database for checking up-to-date inventory status.
      • Chemical Search API provides access via KNIME nodes, Pipeline Pilot protocols, and Microsoft Excel templates. We also provide examples based on Java, JavaScript, C#, and Python.
      • Full Molecule Load API allows downloading the full supplier data for a specific molecule in JSON format.
    • Other Third-Party Platforms: MolPort data is also available via these platforms: Binding DB (subset), ChemSpider, Optibrium StarDrop (API instant access), PubChem, Schrödinger, Zinc (including API instant access), ZINCPharmer

More Details

The following database subsets are available for download, depending on your project requirements:

Database

All Stock Compounds

This subset contains information on stock products which can be delivered in two weeks or sooner. The files combine Screening Compounds and Building Blocks, marked according to type.

  • FTP folder name: "All Stock Compounds"
  • File names start: "IIS"

All Stock Screening Compounds

This subset of All Stock Compounds contains screening compounds typically available in milligram quantities. Over 99% of stock screening compounds have a guaranteed purity of over 90% by H-NMR or LCMS.

  • FTP folder name: "Screening Compounds"
  • File names start: "IISSC"

All Stock Building Blocks

This subset of All Stock Compounds contains data on all stock building blocks. Building blocks are generally available in larger amounts than screening compounds. They can have higher purity or better characteristics, and their lead time is generally shorter. Building Blocks usually have one or more active functional groups, so they can be used to produce new compounds in specific chemical reactions.

  • FTP folder name: "Building Blocks"
  • File names start: "IISBB"

Made-to-order Compounds

Made-to-order compounds are an addition to the Stock compounds. The files contain over 20 million products (both SC and BB) with predefined prices and suppliers. However, the prices for such compounds should be considered for reference purposes only. Suppliers tend to call such products "virtual", "tangible", "back ordered", "to be synthesized", etc. The lead time for made-to-order compounds varies from 4-6 weeks to 3 months. The estimated synthesis success rate is 50%-80%.

  • FTP folder name: "Made To Order"
  • File names start: "V"

Full Database

This set combines Stock and made-to-order compounds. The files also contain all historical information on products which were sold out, have no price defined, or do not currently have specific suppliers assigned. In total, this set contains over 40 million chemical structures. Data is provided in SMILES format only.

  • FTP folder name: "Full DB"
  • File names start with: "fulldb"

Additional properties of the compounds in the MolPort database are supplied in properties files. These are compressed, tab-separated text files containing MolPort IDs and additional molecule properties such as stock amounts, lead times, and price ranges. You can use these files to supplement the information contained in the Open Access Files or SMILES files. The table below lists which data fields are available through each FTP/HTTP download and API access option. Note that not all data fields are available in every downloadable file.

Data field | Open access FTP files (SDF/SMILES) | FTP SDF (2D) files | FTP SMILES files | FTP Properties file | FTP Monthly update files (SDF/SMILES) | HTTP download (SDF/SMILES) | Chemical Search API (also via Excel, KNIME, Pipeline Pilot) | Full Molecule Load API
Data update frequency | Monthly | Monthly | Monthly | Monthly | Monthly | Monthly | Daily | Daily
Structure | Y | Y | Y | N | Y | Y | Y | Y
MolPort ID | Y | Y | Y | Y | Y | Y | Y | Y
Maximum stock amount verified with supplier | N | Y | N | Y | Y | Y/N | Y | N
Maximum unverified amount (as package size) | N | Y | N | Y | Y | Y/N | Y | N
Is available as Screening Compound? | N | Y | N | Y | Y | Y/N | N | N
Is Building Block? | N | Y | N | Y | Y | Y/N | N | N
Best lead time | N | Y | Y | Y | Y | Y | N | N
Price range for 1 mg | N | Y | Y | Y | Y | Y | N | N
Price range for 5 mg | N | Y | Y | Y | Y | Y | N | N
Price range for 50 mg | N | Y | Y | Y | Y | Y | N | N
QC methods | N | Y | N | Y | Y | Y/N | N | N
Compound state | N | Y | N | Y | Y | Y/N | N | N
InChI | N | N | Y | Y | N | Y/N | N | N
InChI Key | N | N | Y | Y | N | Y/N | N | N
IUPAC name | N | N | N | Y | N | N | N | Y
Direct link to molecular page | Y | Y | N | Y | Y | Y/N | Y | Y
Download size (compressed archive) | 2 GB/100 MB | ~2 GB | ~500 MB | ~500 MB | ~30-50 MB | 2 GB/500 MB | 0 | 0
Full file size | 20 GB/1 GB | ~20 GB | ~2 GB | ~2 GB | 300-500 MB | 20 GB/2 GB | 0 | 0
Number of files | 2 (1 per type) | 15 | 15 | 1 | 3* | 2 (1 per type) | 0 | 0
Suppliers | N | N | N | N | N | N | N | Y
Catalog numbers | N | N | N | N | N | N | N | Y
Prices | N | N | N | N | N | N | N | Y
Pack sizes | N | N | N | N | N | N | N | Y
Shipping costs | N | N | N | N | N | N | N | Y
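
As a rough illustration of how a properties file can supplement a structure file, the sketch below joins the two on MolPort ID. It rests on several assumptions: the properties file is gzip-compressed (the page only says "compressed"), and the column names MOLPORTID and BEST_LEAD_TIME, the sample IDs, and the sample values are all hypothetical placeholders. Check the header line of the real files before adapting it.

```python
import csv
import gzip
import io


def read_tsv(text_stream):
    """Return an iterator of dicts for a tab-separated stream with a header line."""
    return csv.DictReader(text_stream, delimiter="\t")


def merge_properties(structure_rows, property_rows, id_column):
    """Attach property columns to each structure row that shares the same ID."""
    props_by_id = {row[id_column]: row for row in property_rows}
    merged = []
    for row in structure_rows:
        extra = props_by_id.get(row[id_column], {})
        merged.append({**row, **extra})
    return merged


# In-memory stand-ins for the downloaded files (hypothetical columns and IDs).
structures_text = (
    "SMILES\tMOLPORTID\n"
    "CCO\tMolPort-000-000-001\n"
    "c1ccccc1\tMolPort-000-000-002\n"
)
properties_gz = gzip.compress(
    "MOLPORTID\tBEST_LEAD_TIME\nMolPort-000-000-001\t5\n".encode("utf-8")
)

# Assumption: gzip compression; swap in zipfile or similar if the archive differs.
with gzip.open(io.BytesIO(properties_gz), "rt", encoding="utf-8", newline="") as fh:
    property_rows = list(read_tsv(fh))

merged = merge_properties(
    read_tsv(io.StringIO(structures_text)), property_rows, "MOLPORTID"
)
for row in merged:
    print(row)
```

Rows without a matching entry in the properties file are kept unchanged, so a missing property shows up as an absent key rather than a dropped compound.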

Downloaded chemical structures are provided as SDfiles or SMILES strings.

SDfiles

An SDfile or structure-data file is a text file with a predefined format to store chemical data. It is the most popular standard format to store 2D structures of molecules, listing each atom with coordinates and a connectivity table for atom/atom bonds. Each molecule has its properties stored after the molecule block. Most chemical software can use SDfiles directly or convert them for internal representation using a built-in import process.
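
To make the record layout concrete, here is a minimal sketch of reading an SDfile with plain Python. It treats each molecule block as opaque text and collects the data items that follow it, relying on the standard "> <NAME>" data headers and the "$$$$" record terminator. The field names MOLPORT_ID and LEAD_TIME_DAYS and the sample record are hypothetical; for real work a cheminformatics toolkit is the better choice.

```python
def iter_sdf_records(lines):
    """Yield (molblock_text, fields) for each record in an SDfile.

    The molblock is not interpreted; data items are returned as a dict.
    """
    molblock, fields, current, in_data = [], {}, None, False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":  # end of one record
            yield "\n".join(molblock), {k: "\n".join(v) for k, v in fields.items()}
            molblock, fields, current, in_data = [], {}, None, False
        elif line.startswith(">") and line.rfind(">") > line.find("<") >= 0:
            # Data header such as "> <MOLPORT_ID>"
            current = line[line.find("<") + 1 : line.rfind(">")]
            fields[current] = []
            in_data = True
        elif in_data:
            if not line.strip():
                current = None  # a blank line closes the current value
            elif current is not None:
                fields[current].append(line)
        else:
            molblock.append(line)


# Hypothetical single-record SDfile used only for demonstration.
SAMPLE = """\
ethanol
  hypothetical-demo

  3  2  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.2500    1.3000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  2  3  1  0
M  END
> <MOLPORT_ID>
MolPort-000-000-001

> <LEAD_TIME_DAYS>
5

$$$$
"""

records = list(iter_sdf_records(SAMPLE.splitlines()))
for molblock, fields in records:
    print(molblock.splitlines()[0], fields)
```

Because the function consumes any iterable of lines, it can be fed an open file handle directly, which avoids loading a multi-gigabyte SDfile into memory.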

SMILES

SMILES is an acronym for 'Simplified Molecular Input Line Entry System'. This format includes atom types and connectivity, but the 2D structure must be generated via software on the fly if it is needed. SMILES is much more compact and better suited for storage in standard databases, text fields or spreadsheets, since each structure is a single compact text string. Most modern chemical software can use this format as well. The most recent formal version of SMILES is OpenSMILES.

A single structure can be coded with SMILES in multiple ways, depending on the starting atom. In order to match identical molecules using SMILES, a set of algorithmic rules is applied to standardize the representation and obtain the so-called Canonical SMILES. Note that obtaining canonical SMILES for some structures may fail, and that the canonicalization rules can differ between software packages.

Our SMILES files are tab-separated text column files. They contain structures as SMILES (with ChemAxon SMILES extensions, when needed), Canonical SMILES (created using the Chemistry Development Kit), MolPort IDs, and other properties. The first line contains column headers. If your chemical data software can't import or open SMILES in text files, you may need to change the file extension to ".smi" or ".smiles". Files can also be opened with a basic text editor or in Microsoft Excel for review.
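
The matching idea can be sketched as follows: read the tab-separated file using its header line, then group MolPort IDs by the canonical SMILES column. The column names (SMILES, SMILES_CANONICAL, MOLPORTID), the IDs, and the canonical strings in the sample are hypothetical and are not actual Chemistry Development Kit output. Grouping is only meaningful when every canonical string comes from the same software, and rows with an empty canonical value are skipped.

```python
import csv
import io
from collections import defaultdict


def group_by_canonical(rows, canonical_column, id_column):
    """Map each canonical SMILES string to the list of IDs that share it."""
    groups = defaultdict(list)
    for row in rows:
        key = row.get(canonical_column, "")
        if key:  # skip rows where no canonical SMILES is present
            groups[key].append(row[id_column])
    return dict(groups)


# Hypothetical file content; the first line holds the column headers.
SAMPLE = (
    "SMILES\tSMILES_CANONICAL\tMOLPORTID\n"
    "OCC\tCCO\tMolPort-000-000-001\n"
    "c1ccccc1\tc1ccccc1\tMolPort-000-000-002\n"
    "C(O)C\tCCO\tMolPort-000-000-003\n"
    "CC\t\tMolPort-000-000-004\n"
)

rows = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
groups = group_by_canonical(rows, "SMILES_CANONICAL", "MOLPORTID")

# Canonical strings shared by more than one record point to the same molecule.
duplicates = {smi: ids for smi, ids in groups.items() if len(ids) > 1}
print(duplicates)
```

For a real file, replace the in-memory sample with an open file handle and the column names with those found in the file's header line.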


While the online database is updated daily, the available download files are updated monthly. After an initial download, a Monthly Update File lists added compounds (SDfile/SMILES, MolPort ID, and stock amount) and removed compounds.

More up-to-date data can be obtained by using the Chemical Search API (see below) to query the online MolPort database.


You can download MolPort database files via FTP, HTTP, special tools, and Web Services API, and MolPort data is also available via several third-party platforms.


Two FTP file sets are available, Open Access Files and Standard FTP Download, plus a Monthly Update File.

Open Access Files

These files allow downloading information on all stock compounds available on MolPort.com from an FTP server with standard credentials (does not require a registered account).

Two files, named MolPortAllStockCompounds (SMILES and SDfile), contain key information on all stock compounds (structures, MolPort IDs and a direct link to the dedicated molecule page on molport.com where additional information about suppliers, prices, purity, delivery time, shipping costs, etc. is available). These files are updated monthly.

FTP: ftp://molport.com/
Username: MolPortUser
Password: MolPortUser
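
A download of the Open Access Files could be scripted with Python's standard ftplib, roughly as sketched below. The host and credentials are the ones listed above; everything else is an assumption, including that the files sit in the server's root directory, that their names begin with "MolPortAllStockCompounds", and that the server is still reachable with these settings. The network call is wrapped in main() and is not run automatically.

```python
from ftplib import FTP

HOST = "molport.com"        # from ftp://molport.com/
USER = "MolPortUser"
PASSWORD = "MolPortUser"
PREFIX = "MolPortAllStockCompounds"  # assumed start of the file names


def list_open_access_files(ftp, prefix=PREFIX):
    """Return remote names in the current directory whose base name starts with prefix."""
    names = ftp.nlst()
    return sorted(n for n in names if n.rsplit("/", 1)[-1].startswith(prefix))


def download(ftp, remote_name, local_path):
    """Stream one remote file to disk in binary mode."""
    with open(local_path, "wb") as out:
        ftp.retrbinary("RETR " + remote_name, out.write)


def main():
    """Connect, list the matching files, and download each one."""
    with FTP(HOST) as ftp:
        ftp.login(USER, PASSWORD)
        for name in list_open_access_files(ftp):
            download(ftp, name, name.rsplit("/", 1)[-1])


# Call main() explicitly to run against the live server.
```

Keeping the listing and download steps as functions that take a connected client makes it easy to test them with a stand-in object, or to reuse them with the registered-user credentials for the Standard FTP Download.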

Standard FTP Download

An extended version of the files described above is available for users who want access to advanced data. These files can be accessed at no charge after acquiring a login and password. Please fill out this form to receive your credentials; you will then receive instructions for downloading the files. Note: Only users with corporate/university email addresses can receive valid credentials.

Files on our FTP server are stored in folders created monthly and named correspondingly, for example "2018-04". Each database subset (see the subsets described above) is provided as a list of compressed files, where each file contains 500,000 compounds in SDfile or SMILES format with the associated data. Older folders contain the information necessary for updating data to the current version (see below).
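
When scripting against this layout, two small helpers are often handy: one that builds the monthly folder name from a date, and one that estimates how many files a subset is split into. This is a sketch only; it assumes the "YYYY-MM" pattern seen in the "2018-04" example holds for every month and that the last file of a subset may hold fewer than 500,000 compounds. The compound count used below is illustrative, not an official figure.

```python
from datetime import date

COMPOUNDS_PER_FILE = 500_000  # per the description above


def monthly_folder(day):
    """Return the folder name for the month containing `day`, e.g. '2018-04'."""
    return day.strftime("%Y-%m")


def expected_file_count(n_compounds, per_file=COMPOUNDS_PER_FILE):
    """Number of files needed if each holds at most `per_file` compounds."""
    return (n_compounds + per_file - 1) // per_file  # integer ceiling division


print(monthly_folder(date(2018, 4, 15)))
print(expected_file_count(7_200_000))  # illustrative compound count
```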

Monthly Update File

Each month we create a special folder - "Changed Since Previous Update", which contains data on added compounds (SDfile and SMILES files) and removed compounds (plain text files with MolPort IDs). The folder "Amount Data" contains tab-separated text files with MolPort IDs and stock amounts. This allows you to update the previously acquired information.
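
Applying a monthly update to a local copy can be sketched as three steps: drop the removed IDs, add the new compounds, then refresh stock amounts. The in-memory record layout, the "stock_amount" key, and the sample IDs below are hypothetical; the real files would first need to be parsed into these structures.

```python
def parse_removed_ids(text):
    """Read a plain-text list of MolPort IDs, one per line, ignoring blanks."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def apply_monthly_update(local, added, removed_ids, amounts):
    """Return an updated copy of `local` (ID -> record dict); inputs are not mutated."""
    updated = {mp_id: dict(record) for mp_id, record in local.items()}
    for mp_id in removed_ids:            # 1. drop removed compounds
        updated.pop(mp_id, None)
    for mp_id, record in added.items():  # 2. add new compounds
        updated[mp_id] = dict(record)
    for mp_id, amount in amounts.items():  # 3. refresh stock amounts
        if mp_id in updated:
            updated[mp_id]["stock_amount"] = amount
    return updated


# Hypothetical data standing in for a previous download and one update folder.
local = {
    "MolPort-000-000-001": {"smiles": "CCO", "stock_amount": "10"},
    "MolPort-000-000-002": {"smiles": "c1ccccc1", "stock_amount": "3"},
}
added = {"MolPort-000-000-003": {"smiles": "CC(=O)O"}}
removed = parse_removed_ids("MolPort-000-000-002\n\n")
amounts = {"MolPort-000-000-001": "25", "MolPort-000-000-003": "7"}

current = apply_monthly_update(local, added, removed, amounts)
print(sorted(current))
```

If several months were skipped, the same function can be applied once per monthly folder, oldest first.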



If FTP access is not the ideal option, you can also download the data via HTTP. Files obtained with this protocol are duplicates of the FTP downloads, with one difference: each data subset has only one file to download. Contact us to receive your credentials for using this option.


The download process can be automated using scripts, automatic downloading, or workflow automation tools. As an example, this blog post shows how to automate the download process for MolPort files using KNIME.


API Access

Web Services API provides programmatic access to the MolPort database for up-to-date information.

We synchronize data with the chemical suppliers as frequently as possible, and 80% of stock compound inventory data is updated daily. These changes are reflected immediately on our website. However, downloadable files are only updated once a month. The MolPort Web Services API lets you check up-to-date inventory programmatically. Visit this page to find out more. To access data via Web Services, we provide KNIME nodes, Pipeline Pilot protocols, Microsoft Excel templates, and examples based on the Java, JavaScript, C# and Python programming languages.

Data collected from multiple suppliers is not suitable for storage in SDfile or SMILES formats. Instead, you can use the Full Molecule Load API to download complete supplier data for a specific molecule in JSON format. The data will include all supplier details shown on the dedicated compound pages at molport.com.

Please note that the prices you get via Web Services do not include any possible volume discounts, which apply when ordering a larger number of compounds. If you require a formal quotation, we suggest using the MolPort List Search capability, as it calculates the best way to procure your compound set and you can generate a formal quote online as a spreadsheet or in PDF format.


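MolPort works collaboratively with third parties to ensure that its data is available through their various software applications and databases. MolPort data is currently available via the following third-party platforms: Binding DB (subset), ChemSpider, Optibrium StarDrop (API instant access), PubChem, Schrödinger, Zinc (including API instant access), and ZINCPharmer.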

Interested in getting database access?

If you would like to access MolPort data via another application or platform, please contact us and we will try to make it possible.

2019 MolPort, v3.26, release date 12-11-2019 13:00 (+0200). All rights reserved.