oumi.analyze.utils
Utility functions for the analyze module.
- oumi.analyze.utils.to_analysis_dataframe(conversations: list[Conversation], results: Mapping[str, Sequence[BaseModel] | BaseModel], message_to_conversation_idx: list[int] | None = None) -> DataFrame
Convert typed analysis results to a pandas DataFrame.
Creates a DataFrame with one row per conversation, with columns for conversation metadata and all analyzer metrics. Analyzer field names are prefixed with the analyzer name to avoid collisions.
Example
>>> results = {"LengthAnalyzer": [LengthMetrics(...), LengthMetrics(...)]}
>>> df = to_analysis_dataframe(conversations, results)
>>> print(df.columns.tolist())
['conversation_id', 'conversation_index', 'num_messages', 'length__total_chars', 'length__total_words', ...]
- Parameters:
conversations – List of conversations that were analyzed.
results – Dictionary mapping analyzer names to results.
  - For per-conversation results: list of BaseModel (len = num conversations)
  - For message-level results: list of BaseModel (len = num messages)
  - For dataset-level results: single BaseModel (will be repeated)
message_to_conversation_idx – Optional mapping from message index to conversation index. When provided, message-level results are aggregated per conversation; it is required for that aggregation to work properly.
- Returns:
DataFrame with conversation metadata and all metrics as columns.
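The three result shapes described under Parameters can be illustrated with a small standard-library sketch. This is not oumi's implementation: `build_rows`, the metric classes, the use of dataclasses in place of pydantic models, the `__` column separator applied to the dictionary key as-is, and averaging as the message-level aggregation are all assumptions made for illustration. It returns a list of row dictionaries rather than a DataFrame.

```python
from dataclasses import dataclass, asdict
from statistics import mean


# Hypothetical metric types standing in for pydantic BaseModel results.
@dataclass
class LengthMetrics:
    total_chars: int


@dataclass
class MessageMetrics:
    num_words: int


@dataclass
class DatasetStats:
    num_conversations: int


def build_rows(num_conversations, results, message_to_conversation_idx=None):
    """Build one row (dict) per conversation with prefixed metric columns."""
    rows = [{"conversation_index": i} for i in range(num_conversations)]
    for name, result in results.items():
        if not isinstance(result, (list, tuple)):
            # Dataset-level result: repeat the same values on every row.
            for row in rows:
                for field, value in asdict(result).items():
                    row[f"{name}__{field}"] = value
        elif len(result) == num_conversations:
            # Per-conversation result: item i belongs to conversation i.
            for row, item in zip(rows, result):
                for field, value in asdict(item).items():
                    row[f"{name}__{field}"] = value
        elif message_to_conversation_idx is not None:
            # Message-level result: group values by conversation, then
            # average them (the averaging is an assumption of this sketch).
            grouped = {}
            for msg_idx, item in enumerate(result):
                conv_idx = message_to_conversation_idx[msg_idx]
                for field, value in asdict(item).items():
                    grouped.setdefault((conv_idx, field), []).append(value)
            for (conv_idx, field), values in grouped.items():
                rows[conv_idx][f"{name}__{field}"] = mean(values)
    return rows


results = {
    "length": [LengthMetrics(10), LengthMetrics(30)],                 # per conversation
    "msg": [MessageMetrics(2), MessageMetrics(4), MessageMetrics(6)],  # per message
    "dataset": DatasetStats(num_conversations=2),                     # whole dataset
}
# Messages 0 and 1 belong to conversation 0; message 2 to conversation 1.
rows = build_rows(2, results, message_to_conversation_idx=[0, 0, 1])
```

In this sketch a list whose length equals the number of conversations is always treated as per-conversation, so a dataset with exactly one message per conversation would not reach the aggregation branch; how the real function resolves that case is not stated above.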