neurox.analysis
Submodules:
neurox.analysis.corpus
Module for corpus based analysis.
This module contains functions that relate neurons to corpus elements like words and sentences
neurox.analysis.corpus.get_top_words(tokens, activations, neuron, num_tokens=0)
Get top activating words for any given neuron.
This method compares the activations of the given neuron across all tokens and extracts the tokens that account for the largest variance for that neuron. It also returns a normalized score for each token, indicating its contribution to the overall variance.
- Parameters
 tokens (dict) – Dictionary containing at least one list with the key source. Usually returned from data.loader.load_data
activations (list of numpy.ndarray) – List of sentence representations, where each sentence representation is a numpy matrix of shape [num tokens in sentence x concatenated representation size]. Usually returned from data.loader.load_activations
neuron (int) – Index of the neuron relative to X
num_tokens (int, optional) – Number of top tokens to return. Defaults to 0, which returns all tokens with a non-negligible contribution to the variance
- Returns
 top_neurons – List of tuples, where each tuple is a (token, score) element
- Return type
 list of tuples
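As an illustration of the expected inputs and of the returned (token, score) list, here is a minimal pure-Python sketch. It does not call NeuroX: rank_tokens_for_neuron is a hypothetical stand-in that scores each token by the absolute deviation of the neuron's activation from its mean, normalized by the largest deviation. The library's actual scoring, and its cut-off for a "non-negligible" contribution, may differ. The sketch also assumes that tokens["source"] holds one list of tokens per sentence, and it uses nested lists in place of numpy arrays.

```python
# Hypothetical stand-in for illustration only; not the NeuroX implementation.
def rank_tokens_for_neuron(tokens, activations, neuron, num_tokens=0):
    """Return (token, score) tuples sorted by descending score."""
    # Flatten the sentences into parallel lists of tokens and activation values.
    flat_tokens = [tok for sentence in tokens["source"] for tok in sentence]
    flat_values = [row[neuron] for sentence in activations for row in sentence]

    # Score each occurrence by its absolute deviation from the neuron's mean,
    # normalized so that the strongest occurrence gets a score of 1.0.
    mean = sum(flat_values) / len(flat_values)
    deviations = [abs(value - mean) for value in flat_values]
    max_deviation = max(deviations) or 1.0

    # Keep the best score seen for each distinct token.
    best = {}
    for tok, deviation in zip(flat_tokens, deviations):
        best[tok] = max(best.get(tok, 0.0), deviation / max_deviation)

    ranked = sorted(best.items(), key=lambda pair: pair[1], reverse=True)
    # Unlike the documented function, num_tokens=0 here returns every token
    # without applying any contribution threshold.
    return ranked[:num_tokens] if num_tokens > 0 else ranked


# Two sentences of three tokens each, with two "neurons" per token.
tokens = {"source": [["the", "cat", "sat"], ["a", "dog", "ran"]]}
activations = [
    [[0.1, 0.0], [0.9, 0.2], [0.1, 0.1]],
    [[0.0, 0.3], [0.8, 0.1], [0.2, 0.0]],
]

top = rank_tokens_for_neuron(tokens, activations, neuron=0, num_tokens=2)
for token, score in top:
    print(token, round(score, 3))
```

With the real library, the same tokens dictionary and a list of per-sentence numpy matrices would be passed to get_top_words in the same argument order.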
neurox.analysis.plotting
neurox.analysis.visualization
neurox.analysis.visualization.visualize_activations(tokens, activations, darken=2, colors=['#d35f5f', '#00aad4'], text_direction='ltr', char_limit=60, font_size=20, filter_fn=<function <lambda>>)
Visualize activation values for a particular neuron on some text.
This method returns an SVG drawing of text with every token’s background color set according to the passed in activation values (red for negative values and blue for positive).
- Parameters
 tokens (list of str) – List of tokens over which the activations have been computed. In the rendered image, tokens will be separated by a single space.
activations (list of float) – List of activation values, one per token.
darken (int, optional) – Number of times to render the red/blue background. Increasing this value will reduce contrast but may help in better distinguishing between tokens. Defaults to 2
colors (list of str, optional) – List of two elements, the first indicating the color of the lowest activation value and the second indicating the color of the highest activation value. Defaults to shades of red and blue respectively
text_direction (str, optional) – One of ltr or rtl, indicating if the language being rendered is written left to right or right to left. Defaults to ltr
char_limit (int, optional) – Maximum number of characters per line. Defaults to 60
font_size (int, optional) – Font size in pixels. Defaults to 20px
filter_fn (str or fn, optional) – Additional function that modifies the incoming activations. Defaults to None, which keeps the activations as is. If a function is provided, it must accept a list of activations and return a list with exactly the same number of elements. The str choices are currently:
top_tokens: Only highlights tokens whose activation values are within 80% of the top activating token in a given sentence. Absolute values are used for comparison.
- Returns
 rendered_svg – An SVG object that you can either save to file, convert into a png within python using an external library like Pycairo, or display in a notebook using display from the module IPython.display
- Return type
 svgwrite.Drawing
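The filter_fn contract (a list of activations in, a list of the same length out) can be sketched with a custom function. keep_strongest below is a hypothetical example, not part of NeuroX. It zeroes every activation whose absolute value falls below 80% of the largest absolute value, which is only one possible reading of the top_tokens description above.

```python
# Hypothetical custom filter; the name and the 0.8 ratio are assumptions.
def keep_strongest(activations, ratio=0.8):
    """Zero out activations weaker than `ratio` times the strongest one."""
    if not activations:
        return []
    # Compare on absolute values so strong negative activations survive too.
    threshold = ratio * max(abs(a) for a in activations)
    # Return one value per input value so the output length matches the input.
    return [a if abs(a) >= threshold else 0.0 for a in activations]


filtered = keep_strongest([0.1, -0.9, 0.75, 1.0])
print(filtered)
```

A function of this shape could presumably be passed as filter_fn=keep_strongest, since it accepts a single list and returns a list of the same length.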
class neurox.analysis.visualization.TransformersVisualizer(model_name)
Bases: object
Helper class to visualize sentences using activations from a transformers model.
model_name
A transformers model name or path, e.g. bert-base-uncased
- Type
 str
model
The loaded model
- Type
 transformers model
tokenizer
The loaded tokenizer
- Type
 transformers tokenizer
__call__(tokens, layer, neuron)
An object of this class can be called directly to get the visualized activations
Examples
>>> visualizer = TransformersVisualizer('bert-base-uncased')
>>> svg1 = visualizer(["This", "is", "a", "test"], 0, 10)
>>> svg2 = visualizer(["This", "is", "another", "test"], 5, 767)
Module contents: