Batch Polymer Embedding Analysis

ML Prediction TransPolymer
Batch analysis of 10 common polymers using TransPolymer embeddings to compute pairwise cosine similarities. PET and Polycarbonate were the most similar pair (98.5%), and the embeddings were exported for later ML tasks.

Chat Dialogs

Analyze 10 common polymers

PROMPT: I have a list of 10 polymer structures I need to analyze. Can you generate embeddings for all of them and help me find which ones are most similar?
CONTEXT: Polymers include: Polyethylene (*CC*), Polypropylene (*CC(C)*), Polybutylene (*CC(CC)*), Polystyrene (*CC(C1=CC=CC=C1)*), PVC (*CC(Cl)*), PVDF (*CC(F)(F)*), PMMA (*CC(C)(C(=O)OC)*), Nylon-6, PET, Polycarbonate.
THINKING: I'll mount TransPolymer and generate 768D embeddings for each polymer, then compute pairwise cosine similarities to find clusters.
RESULT: All 10 polymers received 768-dimensional embeddings from TransPolymer. Computed 45 pairwise similarities (10 choose 2 combinations).
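
The pairwise step can be sketched roughly as follows. This is a minimal illustration and not TransPolymer's own code: the helper name `pairwise_cosine` is hypothetical, and the random vectors only stand in for real embeddings of the same shape.

```python
from itertools import combinations

import numpy as np


def pairwise_cosine(embeddings: np.ndarray) -> np.ndarray:
    """Return the (n, n) cosine similarity matrix for the row vectors."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms  # assumes no all-zero rows
    return unit @ unit.T


# Stand-in data (assumption): random vectors shaped like the real embeddings.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(10, 768))

sim = pairwise_cosine(fake_embeddings)

# Each unordered pair appears once: 10 choose 2 index pairs.
pairs = list(combinations(range(len(fake_embeddings)), 2))
n_pairs = len(pairs)
```

Normalizing the rows once and taking a single matrix product avoids looping over pairs; the upper triangle of `sim` then holds one value per unordered pair.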

Top similar polymer pairs

PROMPT: Which polymers are most similar to each other?
CONTEXT: Cosine similarity ranges from -1 (opposite direction) through 0 (orthogonal) to 1 (same direction). In this analysis, values above 0.95 are read as high structural similarity.
THINKING: Sorting the 45 similarity pairs to find the highest and lowest matches.
RESULT: Top similar pairs:
- PET / Polycarbonate: 98.5% (aromatic polyesters)
- PMMA / Nylon-6: 98.1% (carbonyl functional groups)
- Polypropylene / Polybutylene: 97.8% (polyolefin family)
Least similar:
- Polyethylene / Polycarbonate: 79.6% (simple vs complex)

Export embeddings for ML

PROMPT: Can you save these embeddings for later use in my property prediction model?
CONTEXT: Embeddings can be used as feature vectors for ML models predicting Tg, mechanical properties, or other polymer characteristics.
THINKING: I'll export in multiple formats for different use cases.
RESULT: Created files:
- polymer_embeddings.json (human-readable with metadata)
- polymer_embeddings.npy (10x768 NumPy array for direct ML input)
- polymer_names.json (name/SMILES mapping)
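
One way such an export could be written is sketched below. The function name `export_embeddings`, the metadata fields in the JSON file, and the list-of-records layout of `polymer_names.json` are all assumptions made for illustration; the source only names the three files.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def export_embeddings(polymers, embeddings, out_dir="."):
    """Write the three files; `polymers` is a list of {"name", "smiles"}
    dicts in the same order as the rows of `embeddings`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = [p["name"] for p in polymers]

    # Binary matrix for direct ML input.
    np.save(out / "polymer_embeddings.npy", np.asarray(embeddings, dtype=np.float32))

    # Human-readable copy with metadata (field names are assumed).
    readable = {
        "model": "TransPolymer",
        "dimension": int(np.shape(embeddings)[1]),
        "embeddings": {n: np.asarray(v).tolist() for n, v in zip(names, embeddings)},
    }
    (out / "polymer_embeddings.json").write_text(json.dumps(readable, indent=2))

    # Row order to name/SMILES mapping.
    (out / "polymer_names.json").write_text(json.dumps(polymers, indent=2))


# Round trip with placeholder vectors (zeros, not real embeddings).
polymers = [
    {"name": "Polyethylene", "smiles": "*CC*"},
    {"name": "Polypropylene", "smiles": "*CC(C)*"},
]
vectors = np.zeros((2, 768))
with tempfile.TemporaryDirectory() as tmp:
    export_embeddings(polymers, vectors, tmp)
    loaded = np.load(Path(tmp) / "polymer_embeddings.npy")
    names_back = json.loads((Path(tmp) / "polymer_names.json").read_text())
```

Keeping the name mapping in a separate file means the `.npy` array stays a plain numeric matrix, while row order remains recoverable.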

Technical Details

Embedding Generation

INPUT:
{
  "polymers": [
    {"name": "Polyethylene", "smiles": "*CC*"},
    {"name": "Polypropylene", "smiles": "*CC(C)*"},
    {"name": "Polystyrene", "smiles": "*CC(C1=CC=CC=C1)*"},
    {"name": "PET", "smiles": "*OC(=O)C1=CC=C(C=C1)C(=O)OCC*"},
    {"name": "Polycarbonate", "smiles": "*OC(C)(C)C1=CC=C(C=C1)C(C)(C)O*"}
  ]
}
OUTPUT:
{
  "success": true,
  "embeddings": {
    "Polyethylene": [0.123, -0.456, ...768 dims...],
    "Polypropylene": [0.134, -0.445, ...768 dims...],
    "PET": [0.567, -0.234, ...768 dims...]
  }
}

Similarity Results

OUTPUT:
{
  "top_similar": [
    {"polymer1": "PET", "polymer2": "Polycarbonate", "similarity": 0.9848},
    {"polymer1": "PMMA", "polymer2": "Nylon-6", "similarity": 0.9814},
    {"polymer1": "Polypropylene", "polymer2": "Polybutylene", "similarity": 0.9778}
  ],
  "least_similar": [
    {"polymer1": "Polyethylene", "polymer2": "Polycarbonate", "similarity": 0.7962},
    {"polymer1": "Polyethylene", "polymer2": "PET", "similarity": 0.8163}
  ]
}
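
Output of this shape could be produced by sorting all unordered pairs by similarity. The sketch below is illustrative only: `rank_pairs` is a hypothetical helper, and the 3x3 matrix uses made-up values for three placeholder names, not the polymer results above.

```python
from itertools import combinations

import numpy as np


def rank_pairs(names, sim, k=3):
    """Sort all unordered pairs by similarity; return the k highest
    (highest first) and the k lowest (lowest first)."""
    scored = [
        {
            "polymer1": names[i],
            "polymer2": names[j],
            "similarity": round(float(sim[i, j]), 4),
        }
        for i, j in combinations(range(len(names)), 2)
    ]
    scored.sort(key=lambda p: p["similarity"], reverse=True)
    return {"top_similar": scored[:k], "least_similar": scored[-k:][::-1]}


# Toy similarity matrix (made-up values for illustration).
names = ["A", "B", "C"]
sim = np.array([
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.5],
    [0.2, 0.5, 1.0],
])
result = rank_pairs(names, sim, k=1)
```

Iterating over `combinations` skips the diagonal and the mirrored lower triangle, so each pair is scored exactly once.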

Export Format

OUTPUT:
# Usage example
import json

import numpy as np

embeddings = np.load("polymer_embeddings.npy")  # shape: (10, 768)
with open("polymer_names.json") as f:
    names = json.load(f)

# Use as features for property prediction
X = embeddings
y = known_tg_values  # placeholder: one known Tg per polymer, in row order
model.fit(X, y)      # placeholder: any estimator with a fit(X, y) method
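
The usage example leaves the choice of `model` open. As one possible concrete instance, the sketch below fits a ridge regression using only NumPy. The choice of ridge, the helper names `fit_ridge` and `predict`, and the synthetic `X` and `y` are assumptions; the random numbers are not real embeddings or measured Tg values.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0):
    """Ridge regression in dual form, suited to few samples and many features."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc Xc^T + lam * I) a = yc, then recover weights w = Xc^T a.
    a = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y)), yc)
    w = Xc.T @ a
    b = y_mean - x_mean @ w  # intercept
    return w, b


def predict(X, w, b):
    return X @ w + b


# Synthetic stand-ins (assumption): replace with the loaded embeddings
# and known property values.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 768))
y = rng.normal(loc=350.0, scale=50.0, size=10)

w, b = fit_ridge(X, y, lam=1e-3)
train_pred = predict(X, w, b)
```

With 10 samples and 768 features, almost any linear model reproduces the training targets, so a close training fit says nothing about generalization. Held-out polymers or leave-one-out evaluation would be needed before trusting predictions.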

Tools Used

brain_discovery, brain_mount_service, brain_predict

Outcomes
