Commit 5e70651

xrl and claude committed
Restructure site as product page for rdkit-rs org
Replace the stream-of-consciousness single page and Debian packaging tutorial with a proper product site: landing page, dedicated pages for rdkit-rs and Cheminee, a Quickwit integration roadmap, and resources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 1d3778e commit 5e70651

7 files changed

Lines changed: 332 additions & 1065 deletions

content/_index.md

Lines changed: 33 additions & 47 deletions
Original file line number | Diff line number | Diff line change
@@ -1,80 +1,66 @@
11
---
22
title: RDKit-rs
33
geekdocNav: true
4-
# geekdocAlign: center
54
geekdocAnchor: false
65
---
76

87
[![License](https://img.shields.io/crates/l/rdkit.svg)](https://crates.io/crates/rdkit)
9-
[![Crates.io](https://img.shields.io/crates/v/rdkit.svg)](https://crates.io/crates/rdkit)
8+
[![rdkit crate](https://img.shields.io/crates/v/rdkit.svg?label=rdkit)](https://crates.io/crates/rdkit)
9+
[![cheminee crate](https://img.shields.io/crates/v/cheminee.svg?label=cheminee)](https://crates.io/crates/cheminee)
1010

11-
The power and speed of [RDKit](https://www.rdkit.org/), the safety of Rust! A combination of low level C++ bindings and useful high level Rust
12-
constructs so you can
11+
**Chemistry infrastructure in Rust** — from molecular parsing to full-text structure search.
1312

14-
* Parse mol/molblocks
15-
* Normalize
16-
* Fingerprint
17-
* Enumerate tautomers/canonicalize
13+
The rdkit-rs organization builds safe, fast, open-source tools for cheminformatics on top of [RDKit](https://www.rdkit.org/) and the Rust ecosystem.
1814

19-
How does it work?
2015
---
2116

22-
The rdkit-rs project provides two key libraries: `rdkit` and `rdkit-sys`. The sys package is a collection of low or zero-cost wrappers exposing a key subset of the RDKit C++ functionality. The `rdkit` package builds on top of the sys package, hiding pointers and providing idiomatic Rust interfaces (think: `Debug` and `Clone` implementations, smart borrowing behavior).
17+
## Projects
2318

24-
With the `rdkit` library you will never need to manually free memory or worry about accessing null pointers. You also get all the benefits of an optimizing compiler and will never wait for garbage collection.
19+
### [rdkit-rs](/docs/rdkit-rs/)
2520

26-
Example
27-
---
28-
29-
in your Cargo.toml:
30-
31-
```
32-
[dependencies]
33-
rdkit = "*"
34-
```
35-
36-
If you satisfy the requirements below, the following code should just compile!
21+
Safe Rust bindings for the RDKit C++ library. Parse SMILES and molblocks, normalize molecules, compute fingerprints, enumerate tautomers, calculate descriptors — all with Rust's memory safety guarantees and zero garbage collection overhead.
3722

3823
```rust
3924
use rdkit::{Properties, ROMol};
4025

41-
pub fn main() {
42-
let mol = ROMol::from_smile("c1ccccc1C(=O)NC").unwrap();
43-
let properties = Properties::new();
44-
let computed: HashMap<String, f64> = properties.compute_properties(&mol);
45-
assert_eq!(*computed.get("NumAtoms").unwrap(), 19.0);
46-
}
26+
let mol = ROMol::from_smile("c1ccccc1C(=O)NC").unwrap();
27+
let properties = Properties::new();
28+
let computed = properties.compute_properties(&mol);
29+
assert_eq!(*computed.get("NumAtoms").unwrap(), 19.0);
4730
```
4831

49-
Browse more [rdkit-rs/rdkit examples](https://github.com/rdkit-rs/rdkit/tree/main/examples)
32+
### [Cheminee](/docs/cheminee/)
5033

51-
Requirements
52-
---
34+
A chemical structure search engine built on [Tantivy](https://github.com/quickwit-oss/tantivy) and rdkit-rs. Index millions of molecules and search by substructure, superstructure, exact match, or similarity. Ships as a REST API with Swagger UI, a CLI for batch operations, and a Docker image ready to run.
5335

54-
We support recent stable Rust versions. The limiting factor is whatever [our C++ bindings library, cxx-rs](https://crates.io/crates/cxx), supports. Check [the cxx Cargo.toml](https://github.com/dtolnay/cxx/blob/master/Cargo.toml#L6) to confirm what `rust-version` is supported.
36+
### [Roadmap: Quickwit Integration](/docs/roadmap/)
5537

56-
Requires a recent version of RDKit, tested against `2022.03.1`. Supports both static and dynamic linking, preferring static linking.
57-
You can use a copy of RDKit installed either from [Mac homebrew](https://homebrew.sh) or [Conda Forge](https://anaconda.org/). We are working to
58-
get Debian packages updated for the most recent RDKit and also including static libraries so we can build portable RDKit applications.
38+
We're working to make Cheminee a first-class plugin inside [Quickwit](https://quickwit.io), bringing S3-backed storage, elastic compute scaling, and the full Quickwit operational model to chemical search.
5939

60-
* brew install rdkit
61-
* conda install -c conda-forge rdkit==2022.03.1
40+
---
6241

63-
Ubuntu support is coming soon.
42+
## Get Started
6443

65-
You will also need a compiler for building the sys package's C++ bridge. We recommend clang for the compilation speed.
44+
Install the Rust crate:
6645

67-
Why Rust?
68-
---
46+
```toml
47+
[dependencies]
48+
rdkit = "0.4"
49+
```
6950

70-
Rust is a powerful systems level programming language, offering a smart static typing system, an integrated build system and package manager, and strong memory safety, among many other benefits. Read more about Rust in [the free Rust Book](https://doc.rust-lang.org/book/).
51+
Or run Cheminee in Docker:
7152

72-
Learn More
73-
---
53+
```bash
54+
docker run --rm -p 4001:4001 ghcr.io/rdkit-rs/cheminee:latest
55+
```
7456

75-
Javier Pineda, PhD., presented on Cheminee at a Scientist Show And Tell, [see the slides](/javier-show-and-tell-2023-truncated.html)
57+
Then visit [localhost:4001](http://localhost:4001) for the Swagger UI.
7658

77-
Issues?
7859
---
7960

80-
Please [file an issue on GitHub](https://github.com/rdkit-rs/rdkit/issues)
61+
## Links
62+
63+
- [GitHub Organization](https://github.com/rdkit-rs)
64+
- [rdkit on crates.io](https://crates.io/crates/rdkit)
65+
- [cheminee on crates.io](https://crates.io/crates/cheminee)
66+
- [Resources & Presentations](/docs/resources/)

content/docs/_index.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
---
2+
title: Documentation
3+
geekdocNav: true
4+
geekdocAnchor: false
5+
---
6+
7+
Documentation for rdkit-rs projects.

content/docs/cheminee.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
1+
---
2+
title: Cheminee
3+
weight: 2
4+
geekdocNav: true
5+
geekdocAnchor: true
6+
---
7+
8+
[![Crates.io](https://img.shields.io/crates/v/cheminee.svg)](https://crates.io/crates/cheminee)
9+
[![License](https://img.shields.io/crates/l/cheminee.svg)](https://crates.io/crates/cheminee)
10+
11+
Cheminee is a chemical structure search engine. Index chemical structures with arbitrary metadata, then search by substructure, superstructure, exact match, similarity, or descriptor queries. Built on [Tantivy](https://github.com/quickwit-oss/tantivy) and [rdkit-rs](/docs/rdkit-rs/).
12+
13+
Client applications don't need RDKit installed; they just talk to the REST API.
14+
15+
## Key Features
16+
17+
- **Structure search** — Substructure, superstructure, identity (exact match), and Tanimoto similarity search
18+
- **Descriptor search** — Query by any RDKit descriptor (exactmw, NumAtoms, etc.) or custom metadata
19+
- **SMILES standardization** — Fragment parent, uncharger, and canonicalization in bulk
20+
- **Format conversion** — SMILES to molblock and molblock to SMILES
21+
- **Neural similarity search** — Similarity queries use a neural network encoder ([cheminee-similarity-model](https://github.com/rdkit-rs/cheminee-similarity-model)) to embed Morgan fingerprints into a latent space, then search ranked clusters instead of brute-forcing every compound
22+
- **REST API** — OpenAPI-documented endpoints with built-in Swagger UI
23+
- **CLI** — Batch index SDF files, run queries from the terminal
24+
- **Docker image**: `ghcr.io/rdkit-rs/cheminee`
25+
- **Ruby client**: [cheminee-ruby](https://github.com/rdkit-rs/cheminee-ruby) gem for programmatic access
26+
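For readers new to the term, the Tanimoto coefficient mentioned in the feature list scores two bit-vector fingerprints as the number of bits set in both divided by the number of bits set in either. The sketch below is a minimal standalone illustration over fingerprints packed into `u64` words. The function name `tanimoto` and the packed representation are assumptions made for this example; it is not the Cheminee or rdkit-rs API.

```rust
/// Tanimoto coefficient of two bit-vector fingerprints packed into u64 words.
/// Hypothetical helper for illustration; not part of cheminee or rdkit.
fn tanimoto(a: &[u64], b: &[u64]) -> f64 {
    assert_eq!(a.len(), b.len(), "fingerprints must have the same length");
    let mut shared = 0u32; // bits set in both fingerprints
    let mut either = 0u32; // bits set in at least one fingerprint
    for (x, y) in a.iter().zip(b) {
        shared += (x & y).count_ones();
        either += (x | y).count_ones();
    }
    if either == 0 {
        // Two empty fingerprints: this sketch defines the score as 0.0.
        return 0.0;
    }
    f64::from(shared) / f64::from(either)
}
```

A score of 1.0 means the two fingerprints are identical and 0.0 means they share no set bits.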
27+
## API Endpoints
28+
29+
| Method | Path | Description |
30+
|--------|------|-------------|
31+
| POST | `/v1/standardize` | Standardize a list of SMILES |
32+
| POST | `/v1/convert/mol_block_to_smiles` | Convert molblocks to SMILES |
33+
| POST | `/v1/convert/smiles_to_mol_block` | Convert SMILES to molblocks |
34+
| GET | `/v1/schemas` | List available index schemas |
35+
| GET | `/v1/indexes` | List indexes |
36+
| GET | `/v1/indexes/{index}` | Get index details |
37+
| POST | `/v1/indexes/{index}` | Create an index |
38+
| DELETE | `/v1/indexes/{index}` | Delete an index |
39+
| POST | `/v1/indexes/{index}/merge` | Merge index segments |
40+
| POST | `/v1/indexes/{index}/bulk_index` | Index SMILES with metadata |
41+
| DELETE | `/v1/indexes/{index}/bulk_delete` | Delete compounds by SMILES |
42+
| GET | `/v1/indexes/{index}/search/basic` | Basic descriptor/metadata search |
43+
| GET | `/v1/indexes/{index}/search/substructure` | Substructure search |
44+
| GET | `/v1/indexes/{index}/search/superstructure` | Superstructure search |
45+
| GET | `/v1/indexes/{index}/search/identity` | Exact structure match |
46+
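The four search endpoints share one path shape, so a client can assemble them from an index name and a search kind. The sketch below only builds the path strings listed in the table. `SearchKind` and `search_path` are hypothetical names for this example, and query parameters are left out because the table does not list them (see the OpenAPI spec for those).

```rust
/// The search endpoints listed in the table above.
#[derive(Clone, Copy, Debug)]
enum SearchKind {
    Basic,
    Substructure,
    Superstructure,
    Identity,
}

/// Build the request path for a search against `index`.
/// Hypothetical helper: it only assembles the path from the table and
/// does not URL-encode the index name or add query parameters.
fn search_path(index: &str, kind: SearchKind) -> String {
    let segment = match kind {
        SearchKind::Basic => "basic",
        SearchKind::Substructure => "substructure",
        SearchKind::Superstructure => "superstructure",
        SearchKind::Identity => "identity",
    };
    format!("/v1/indexes/{index}/search/{segment}")
}
```

Using an enum rather than a free-form string keeps a typo in the search kind from reaching the server as a 404.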
47+
## Quick Start with Docker
48+
49+
Run Cheminee:
50+
51+
```bash
52+
docker run --rm --name cheminee -p 4001:4001 ghcr.io/rdkit-rs/cheminee:latest
53+
```
54+
55+
Visit [localhost:4001](http://localhost:4001) for the Swagger UI.
56+
57+
### Index Some Data
58+
59+
Fetch PubChem SDF files and index them:
60+
61+
```bash
62+
docker exec -it cheminee bash
63+
64+
mkdir -p /tmp/sdfs
65+
cheminee fetch-pubchem -d /tmp/sdfs
66+
cheminee create-index -i /tmp/cheminee/index0 -n descriptor_v1 -s exactmw
67+
cheminee index-sdf -s /tmp/sdfs/Compound_000000001_000500000.sdf.gz -i /tmp/cheminee/index0
68+
```
69+
70+
### CLI Examples
71+
72+
**Basic search** — query by descriptor ranges:
73+
74+
```bash
75+
cheminee basic-search -i /tmp/cheminee/index0 \
76+
-q "exactmw: [10 TO 10000] AND NumAtoms: [8 TO 100]" -l 10
77+
```
78+
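The query string above is a set of inclusive range clauses joined with `AND`. If you generate such queries from code, a small builder avoids hand-formatting mistakes. The sketch below reproduces exactly the shape shown in the example; `RangeClause` and `build_range_query` are hypothetical names and are not part of the cheminee CLI or crate.

```rust
/// One inclusive range clause, e.g. `exactmw: [10 TO 10000]`.
struct RangeClause<'a> {
    field: &'a str,
    min: f64,
    max: f64,
}

/// Join range clauses with AND, mirroring the query string shown above.
/// Hypothetical helper for illustration only.
fn build_range_query(clauses: &[RangeClause<'_>]) -> String {
    clauses
        .iter()
        .map(|c| format!("{}: [{} TO {}]", c.field, c.min, c.max))
        .collect::<Vec<_>>()
        .join(" AND ")
}
```

The resulting string can be passed as the `-q` argument of `cheminee basic-search`.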
79+
**Substructure search:**
80+
81+
```bash
82+
cheminee substructure-search -i /tmp/cheminee/index0 \
83+
-s CCC -r 10 -t 10 -u true -e "exactmw: [20 TO 200]"
84+
```
85+
86+
**Similarity search:**
87+
88+
```bash
89+
cheminee similarity-search -i /tmp/cheminee/index0 \
90+
-s c1ccccc1CC -r 10 -t 10 -p 0.1 -m 0.4
91+
```
92+
93+
## Building from Source
94+
95+
```bash
96+
cargo run --release --package cheminee --bin cheminee -- rest-api-server
97+
```
98+
99+
## Links
100+
101+
- [GitHub: rdkit-rs/cheminee](https://github.com/rdkit-rs/cheminee)
102+
- [crates.io: cheminee](https://crates.io/crates/cheminee)
103+
- [Docker image](https://github.com/rdkit-rs/cheminee/pkgs/container/cheminee)
104+
- [OpenAPI spec](https://github.com/rdkit-rs/cheminee/blob/main/openapi.json)
105+
- [cheminee-ruby](https://github.com/rdkit-rs/cheminee-ruby)
106+
- [cheminee-similarity-model](https://github.com/rdkit-rs/cheminee-similarity-model)

content/docs/rdkit-rs.md

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
1+
---
2+
title: rdkit-rs
3+
weight: 1
4+
geekdocNav: true
5+
geekdocAnchor: true
6+
---
7+
8+
[![Crates.io](https://img.shields.io/crates/v/rdkit.svg)](https://crates.io/crates/rdkit)
9+
[![License](https://img.shields.io/crates/l/rdkit.svg)](https://crates.io/crates/rdkit)
10+
11+
Safe, idiomatic Rust bindings for [RDKit](https://www.rdkit.org/), the industry-standard open-source cheminformatics library.
12+
13+
## What It Provides
14+
15+
The project ships two crates:
16+
17+
- **`rdkit-sys`** — Low-level C++ bindings via [cxx](https://crates.io/crates/cxx). Zero-cost wrappers exposing a key subset of RDKit's C++ API.
18+
- **`rdkit`** — High-level Rust library built on `rdkit-sys`. No manual memory management, no null pointers. Implements `Debug`, `Clone`, and idiomatic borrowing so molecules behave like native Rust types.
19+
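To make the two-layer split concrete, here is a minimal sketch of the general pattern: a low-level module hands out raw pointers, and a safe wrapper owns the pointer, rules out null, and frees it exactly once. Everything in it is a stand-in written for this page. The `sys` module, `RawMol`, and `Mol` are hypothetical, and the real crates go through a cxx bridge to RDKit rather than a `Box`.

```rust
// Stand-in for the low-level layer: an opaque object behind a raw pointer.
// Hypothetical; the real rdkit-sys crate exposes RDKit types via cxx.
mod sys {
    pub struct RawMol {
        pub smiles: String,
    }

    pub fn mol_new(smiles: &str) -> *mut RawMol {
        Box::into_raw(Box::new(RawMol { smiles: smiles.to_owned() }))
    }

    /// Safety: `ptr` must come from `mol_new` and must not be used afterwards.
    pub unsafe fn mol_free(ptr: *mut RawMol) {
        unsafe { drop(Box::from_raw(ptr)) }
    }
}

/// Safe owner: the pointer is never null and is freed exactly once on drop.
struct Mol {
    ptr: std::ptr::NonNull<sys::RawMol>,
}

impl Mol {
    fn new(smiles: &str) -> Option<Mol> {
        std::ptr::NonNull::new(sys::mol_new(smiles)).map(|ptr| Mol { ptr })
    }

    fn smiles(&self) -> &str {
        // Safety: `ptr` stays valid for as long as `self` is alive.
        unsafe { &self.ptr.as_ref().smiles }
    }
}

impl Clone for Mol {
    // A clone allocates its own object, so each `Mol` frees only what it owns.
    fn clone(&self) -> Mol {
        Mol::new(self.smiles()).expect("allocation returned a null pointer")
    }
}

impl Drop for Mol {
    fn drop(&mut self) {
        // Safety: we own the pointer, and drop runs at most once.
        unsafe { sys::mol_free(self.ptr.as_ptr()) }
    }
}
```

Callers only ever see `Mol`, so the `unsafe` blocks stay confined to the wrapper.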
20+
## Capabilities
21+
22+
| Area | What You Can Do |
23+
|------|----------------|
24+
| **Parsing** | SMILES, molblocks, SDF files (including gzipped) |
25+
| **Normalization** | Fragment parent, uncharger, canonical tautomer |
26+
| **Fingerprints** | Morgan fingerprints, pattern fingerprints |
27+
| **Descriptors** | Compute all standard RDKit descriptors (exactmw, NumAtoms, CrippenClogP, etc.) |
28+
| **Tautomers** | Enumerate tautomers, canonicalize |
29+
| **Substructure** | SMARTS-based substructure and superstructure matching |
30+
| **Periodic Table** | Element lookups and properties |
31+
32+
## Quick Start
33+
34+
Add to your `Cargo.toml`:
35+
36+
```toml
37+
[dependencies]
38+
rdkit = "0.4"
39+
```
40+
41+
Example:
42+
43+
```rust
44+
use rdkit::{Properties, ROMol};
45+
use std::collections::HashMap;
46+
47+
fn main() {
48+
let mol = ROMol::from_smile("c1ccccc1C(=O)NC").unwrap();
49+
let properties = Properties::new();
50+
let computed: HashMap<String, f64> = properties.compute_properties(&mol);
51+
assert_eq!(*computed.get("NumAtoms").unwrap(), 19.0);
52+
}
53+
```
54+
55+
Browse more examples in the [examples directory](https://github.com/rdkit-rs/rdkit/tree/main/examples).
56+
57+
## Prerequisites
58+
59+
Requires RDKit 2023.09.1 or higher.
60+
61+
**macOS:**
62+
63+
```bash
64+
brew install rdkit
65+
```
66+
67+
**Linux (Ubuntu 24.04+):**
68+
69+
Pre-compiled static library tarballs are available for AMD64 and ARM64:
70+
71+
- [AMD64](https://rdkit-rs-debian.s3.eu-central-1.amazonaws.com/rdkit_2024_03_3_ubuntu_14_04_amd64.tar.gz)
72+
- [ARM64](https://rdkit-rs-debian.s3.eu-central-1.amazonaws.com/rdkit_2024_03_3_ubuntu_14_04_arm64.tar.gz)
73+
74+
You will also need a C++ compiler (we recommend clang) for building the `rdkit-sys` bridge code.
75+
76+
## Rust Version
77+
78+
We support recent stable Rust versions. The limiting factor is [cxx](https://crates.io/crates/cxx) — check the [cxx Cargo.toml](https://github.com/dtolnay/cxx/blob/master/Cargo.toml#L6) for the minimum `rust-version`.
79+
80+
## Links
81+
82+
- [GitHub: rdkit-rs/rdkit](https://github.com/rdkit-rs/rdkit)
83+
- [crates.io: rdkit](https://crates.io/crates/rdkit)
84+
- [crates.io: rdkit-sys](https://crates.io/crates/rdkit-sys)

content/docs/resources.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
---
2+
title: Resources
3+
weight: 4
4+
geekdocNav: true
5+
geekdocAnchor: true
6+
---
7+
8+
## Presentations
9+
10+
Javier Pineda, PhD, presented on Cheminee at a Scientist Show And Tell: [view the slides](/javier-show-and-tell-2023-truncated.html).
11+
12+
## Why Rust?
13+
14+
Rust is a systems programming language offering memory safety without garbage collection, an integrated build system and package manager (cargo), and a strong static type system. For cheminformatics this means:
15+
16+
- **No GC pauses** — consistent latency when searching millions of molecules
17+
- **Memory safety** — no segfaults, no use-after-free, even when wrapping C++ libraries like RDKit
18+
- **Fearless concurrency** — parallelize indexing and search across cores without data races
19+
- **Single binary deployment** — ship a statically linked binary or Docker image with no runtime dependencies
20+
21+
Learn more about Rust in [The Rust Programming Language](https://doc.rust-lang.org/book/).
22+
23+
## Repositories
24+
25+
| Repository | Description |
26+
|-----------|-------------|
27+
| [rdkit-rs/rdkit](https://github.com/rdkit-rs/rdkit) | High-level Rust bindings for RDKit |
28+
| [rdkit-rs/cheminee](https://github.com/rdkit-rs/cheminee) | Chemical structure search engine |
29+
| [rdkit-rs/cheminee-ruby](https://github.com/rdkit-rs/cheminee-ruby) | Ruby client gem for the Cheminee API |
30+
| [rdkit-rs/cheminee-similarity-model](https://github.com/rdkit-rs/cheminee-similarity-model) | Neural network model for similarity search clustering |
31+
| [rdkit-rs/rdkit-rs.github.io](https://github.com/rdkit-rs/rdkit-rs.github.io) | This website |
32+
33+
## Issues
34+
35+
Please file issues on GitHub: [rdkit-rs/rdkit](https://github.com/rdkit-rs/rdkit/issues) for rdkit-rs, or [rdkit-rs/cheminee](https://github.com/rdkit-rs/cheminee/issues) for Cheminee.
