Explore how language models represent words as points in a high-dimensional space. Use PCA, t-SNE, or UMAP to project them and discover semantic structure.
PCA, t-SNE, and UMAP projections
Semantic nearest-neighbor search
Upload your own embeddings
An embedding is a dense numerical representation of a word, phrase, or token. A model converts text into vectors of hundreds or thousands of dimensions, where distances between vectors reflect semantic relationships: words used in similar contexts end up close together.
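A minimal sketch of the idea, using made-up 4-dimensional vectors (real embeddings have hundreds or thousands of dimensions) and cosine similarity, the most common distance measure for embedding spaces:

```python
import numpy as np

# Toy embeddings, hand-made for illustration only; a real model
# would produce these vectors from text.
embeddings = {
    "cat": np.array([0.9, 0.8, 0.1, 0.0]),
    "dog": np.array([0.8, 0.9, 0.2, 0.1]),
    "car": np.array([0.1, 0.0, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantically related words score higher than unrelated ones.
print(cosine_similarity(embeddings["cat"], embeddings["dog"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
```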
PCA reduces dimensionality by projecting onto the directions of maximum variance. It is fast and deterministic, making it ideal for viewing global structure: the first three components capture the axes of greatest separation between concepts.
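A short example of this projection with scikit-learn, using random vectors as stand-ins for a real embedding matrix:

```python
import numpy as np
from sklearn.decomposition import PCA

# Simulated embeddings: 100 "words" in 300 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 300))

# Keep only the 3 components with the greatest variance.
pca = PCA(n_components=3)
coords = pca.fit_transform(X)

print(coords.shape)                   # (100, 3)
print(pca.explained_variance_ratio_)  # fraction of variance per axis
```

Because PCA is deterministic, rerunning it on the same data always yields the same projection, unlike t-SNE or UMAP.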
t-SNE preserves local structure: points that are close in the original space stay close in the projection, which reveals clusters of similar words. It does not preserve global distances, so the gap between two clusters says nothing about how conceptually distant they are.
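A sketch with scikit-learn's `TSNE` on synthetic data containing two clear clusters; the perplexity value here is an illustrative choice, not a universal setting:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Two well-separated clusters of 30 points each in 50 dimensions.
cluster_a = rng.normal(loc=0.0, size=(30, 50))
cluster_b = rng.normal(loc=10.0, size=(30, 50))
X = np.vstack([cluster_a, cluster_b])

# perplexity must be smaller than the number of samples; smaller
# values emphasize very local neighborhoods.
tsne = TSNE(n_components=2, perplexity=15, random_state=0)
coords = tsne.fit_transform(X)
print(coords.shape)  # (60, 2)
```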
UMAP preserves global structure better than t-SNE and runs faster. It maintains both local proximity and inter-group relationships, which has made it a modern default for exploring embedding spaces.
In a well-trained space, king − man + woman ≈ queen. This vector arithmetic emerges from the statistical structure of the corpus. Select words in the Projector and observe their nearest neighbors.
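The analogy can be sketched with toy 2-dimensional vectors deliberately constructed so that "royalty" and "gender" are independent directions; in a real trained space the relation only holds approximately:

```python
import numpy as np

# Hand-built directions: the analogy holds exactly by construction.
royal  = np.array([1.0,  0.0])
male   = np.array([0.0,  1.0])
female = np.array([0.0, -1.0])

vocab = {
    "king":  royal + male,
    "queen": royal + female,
    "man":   male,
    "woman": female,
}

def nearest(vec, exclude=()):
    """Word whose vector lies closest (Euclidean) to vec."""
    return min((w for w in vocab if w not in exclude),
               key=lambda w: np.linalg.norm(vocab[w] - vec))

result = vocab["king"] - vocab["man"] + vocab["woman"]
print(nearest(result, exclude={"king", "man", "woman"}))  # queen
```

Excluding the query words themselves mirrors standard practice in analogy evaluation, since the input words are otherwise often their own nearest neighbors.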