Batch analysis of 10 common polymers using TransPolymer AI to calculate pairwise similarities.
Identified PET and Polycarbonate as most similar (98.5%) and exported embeddings for later ML tasks.
- 768-dimensional embeddings per polymer
- Pairwise cosine similarity calculations
- Export to JSON and NumPy formats
- Structure-property relationship insights
Chat Dialogs
Analyze 10 common polymers
PROMPT:
I have a list of 10 polymer structures I need to analyze. Can you generate embeddings
for all of them and help me find which ones are most similar?
CONTEXT: Polymers include: Polyethylene (*CC*), Polypropylene (*CC(C)*),
Polybutylene (*CC(CC)*), Polystyrene (*CC(C1=CC=CC=C1)*), PVC (*CC(Cl)*),
PVDF (*CC(F)(F)*), PMMA (*CC(C)(C(=O)OC)*), Nylon-6, PET, Polycarbonate.
THINKING: I'll mount TransPolymer and generate 768D embeddings for each polymer,
then compute pairwise cosine similarities to find clusters.
RESULT:
All 10 polymers received 768-dimensional embeddings from TransPolymer.
Computed 45 pairwise similarities (10 choose 2 combinations).
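The pair count and the similarity step can be reproduced with a short NumPy sketch. It assumes the embeddings are already stacked into an (n, 768) array; random vectors stand in for the TransPolymer output, which is not reproduced here, and the helper name pairwise_cosine is illustrative rather than part of any tool used above.

```python
from itertools import combinations

import numpy as np


def pairwise_cosine(embeddings: np.ndarray) -> np.ndarray:
    """Return the (n, n) cosine similarity matrix for the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms  # scale every row to unit length
    return unit @ unit.T       # dot products of unit vectors are cosines


# Stand-in data: 10 random 768-dimensional vectors, NOT real TransPolymer embeddings.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(10, 768))

sim = pairwise_cosine(fake_embeddings)
pairs = list(combinations(range(10), 2))  # every unordered pair (i, j) with i < j
print(len(pairs))  # 45
```

Only the upper triangle of the matrix is needed, because the matrix is symmetric and its diagonal is always 1.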
Top similar polymer pairs
PROMPT:
Which polymers are most similar to each other?
CONTEXT: Cosine similarity ranges from -1 (opposite direction) to 1 (same direction),
with 0 meaning orthogonal. Here, values above 0.95 are treated as high similarity
in embedding space.
THINKING: Sorting the 45 similarity pairs to find the highest and lowest matches.
RESULT:
Top similar pairs:
- PET / Polycarbonate: 98.5% (aromatic polyesters)
- PMMA / Nylon-6: 98.1% (carbonyl functional groups)
- Polypropylene / Polybutylene: 97.8% (polyolefin family)
Least similar:
- Polyethylene / Polycarbonate: 79.6% (simple aliphatic chain vs aromatic backbone)
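One way the sorting step could look, as a sketch: it assumes a symmetric similarity matrix and a matching list of names. The three-polymer matrix below reuses the PET, Polycarbonate and Polyethylene values reported in this analysis; the helper name rank_pairs is hypothetical.

```python
import numpy as np


def rank_pairs(names, sim):
    """Return (name_a, name_b, similarity) for every unordered pair, highest first."""
    i_idx, j_idx = np.triu_indices(len(names), k=1)  # upper triangle, diagonal excluded
    order = np.argsort(sim[i_idx, j_idx])[::-1]      # indices sorted by descending similarity
    return [
        (names[i_idx[k]], names[j_idx[k]], float(sim[i_idx[k], j_idx[k]]))
        for k in order
    ]


names = ["Polyethylene", "PET", "Polycarbonate"]
sim = np.array([
    [1.0,    0.8163, 0.7962],
    [0.8163, 1.0,    0.9848],
    [0.7962, 0.9848, 1.0],
])

ranked = rank_pairs(names, sim)
print(ranked[0])   # ('PET', 'Polycarbonate', 0.9848)
print(ranked[-1])  # ('Polyethylene', 'Polycarbonate', 0.7962)
```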
Export embeddings for ML
PROMPT:
Can you save these embeddings for later use in my property prediction model?
CONTEXT: Embeddings can be used as feature vectors for ML models predicting
Tg, mechanical properties, or other polymer characteristics.
THINKING: I'll export in multiple formats for different use cases.
RESULT:
Created files:
- polymer_embeddings.json (human-readable with metadata)
- polymer_embeddings.npy (10x768 NumPy array for direct ML input)
- polymer_names.json (name/SMILES mapping)
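A minimal sketch of how such an export could be written. The file names match the list above, but the JSON layout is an assumption, since the actual schema is not shown; export_embeddings is a hypothetical helper, and a zero matrix stands in for the real embeddings.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def export_embeddings(out_dir, polymers, embeddings):
    """Write the NumPy matrix and two JSON files; `polymers` is a list of name/SMILES dicts."""
    out = Path(out_dir)
    # Binary matrix for direct ML input, one row per polymer.
    np.save(out / "polymer_embeddings.npy", embeddings)
    # Name/SMILES mapping, kept in the same order as the matrix rows.
    (out / "polymer_names.json").write_text(json.dumps(polymers, indent=2))
    # Human-readable copy (assumed layout): name -> SMILES plus the embedding as a list.
    readable = {
        p["name"]: {"smiles": p["smiles"], "embedding": row.tolist()}
        for p, row in zip(polymers, embeddings)
    }
    (out / "polymer_embeddings.json").write_text(json.dumps(readable))
    return out


polymers = [
    {"name": "Polyethylene", "smiles": "*CC*"},
    {"name": "Polypropylene", "smiles": "*CC(C)*"},
]
embeddings = np.zeros((2, 768), dtype=np.float32)  # placeholder values, not real embeddings

out = export_embeddings(tempfile.mkdtemp(), polymers, embeddings)
loaded = np.load(out / "polymer_embeddings.npy")
print(loaded.shape)  # (2, 768)
```

Writing the names in row order keeps the .npy matrix and the name mapping aligned when both are reloaded later.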
Technical Details
Embedding Generation
INPUT:
{
  "polymers": [
    {"name": "Polyethylene", "smiles": "*CC*"},
    {"name": "Polypropylene", "smiles": "*CC(C)*"},
    {"name": "Polystyrene", "smiles": "*CC(C1=CC=CC=C1)*"},
    {"name": "PET", "smiles": "*OC(=O)C1=CC=C(C=C1)C(=O)OCC*"},
    {"name": "Polycarbonate", "smiles": "*OC(C)(C)C1=CC=C(C=C1)C(C)(C)O*"}
  ]
}
OUTPUT:
{
  "success": true,
  "embeddings": {
    "Polyethylene": [0.123, -0.456, ...768 dims...],
    "Polypropylene": [0.134, -0.445, ...768 dims...],
    "PET": [0.567, -0.234, ...768 dims...]
  }
}
Similarity Results
OUTPUT:
{
  "top_similar": [
    {"polymer1": "PET", "polymer2": "Polycarbonate", "similarity": 0.9848},
    {"polymer1": "PMMA", "polymer2": "Nylon-6", "similarity": 0.9814},
    {"polymer1": "Polypropylene", "polymer2": "Polybutylene", "similarity": 0.9778}
  ],
  "least_similar": [
    {"polymer1": "Polyethylene", "polymer2": "Polycarbonate", "similarity": 0.7962},
    {"polymer1": "Polyethylene", "polymer2": "PET", "similarity": 0.8163}
  ]
}
Export Format
OUTPUT:
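# Usage example
import json
import numpy as np
from sklearn.linear_model import Ridge  # illustrative choice; the original leaves the model unspecified

embeddings = np.load("polymer_embeddings.npy")  # Shape: (10, 768)
with open("polymer_names.json") as f:
    names = json.load(f)  # name/SMILES mapping

# Use as features for property prediction
X = embeddings
y = np.asarray(known_tg_values)  # placeholder: supply one measured Tg per polymer, shape (10,)
model = Ridge(alpha=1.0)
model.fit(X, y)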
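With only 10 samples and 768 features, any regression on these embeddings is heavily underdetermined, so regularization and a held-out error estimate matter. The sketch below is one possible approach, not the workflow used in this analysis: ridge regression in dual form with leave-one-out evaluation, written in plain NumPy. Both the feature matrix and the target are synthetic stand-ins (no real embeddings or Tg values), and the function names are hypothetical.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Ridge regression in dual form, convenient when n_samples << n_features."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean                      # center the features
    gram = Xc @ Xc.T                           # (n, n) Gram matrix
    dual = np.linalg.solve(gram + alpha * np.eye(len(y_train)), y_train - y_mean)
    return (X_test - x_mean) @ Xc.T @ dual + y_mean


def leave_one_out_mae(X, y, alpha=1.0):
    """Mean absolute error when each sample is predicted from all the others."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # boolean mask that drops sample i
        pred = ridge_fit_predict(X[keep], y[keep], X[i:i + 1], alpha)
        errors.append(abs(pred[0] - y[i]))
    return float(np.mean(errors))


rng = np.random.default_rng(0)
X = rng.normal(size=(10, 768))   # synthetic stand-in for the (10, 768) embedding matrix
y = 5.0 * X[:, 0] + 100.0        # synthetic target, NOT measured Tg data
mae = leave_one_out_mae(X, y)
print(f"leave-one-out MAE: {mae:.3f}")
```

The dual form solves a 10x10 system instead of a 768x768 one, which is why it suits a dataset this small.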
Tools Used
brain_discovery
brain_mount_service
brain_predict
Outcomes
- Generated 768D embeddings for 10 polymers
- PET/Polycarbonate most similar (98.5%) - aromatic polyesters
- Polyethylene/Polycarbonate least similar (79.6%) - simple aliphatic chain vs aromatic backbone
- Exported to JSON and NumPy for ML integration