nemoguardrails.kb.utils
Module Contents
Functions
API
Splits markdown content into topic chunks.
This function takes markdown content as input and divides it into topic chunks based on headings and subsections. Each chunk has a title and a body, and the chunk size is capped by an optional maximum.
Parameters:
- content (str): The markdown content to be split.
- max_chunk_size (int): The maximum size of a chunk. Default is 400.
Returns: List[dict]: A list of dictionaries, each representing a topic chunk with ‘title’ and ‘body’ keys.
Example:
Note:
- The function treats lines that begin with '#' as headings.
- Meta information can be included at the beginning of the markdown using triple backticks.
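The heading-based splitting described above can be illustrated with a small standalone sketch. This is not the library's implementation: `split_by_headings_sketch` is a hypothetical name, it assumes `max_chunk_size` is measured in characters, and it ignores the meta block mentioned in the note. It only shows the general idea of collecting body text under the most recent '#' heading and starting a new chunk when the size cap would be exceeded.

```python
from typing import Dict, List


def split_by_headings_sketch(content: str, max_chunk_size: int = 400) -> List[Dict[str, str]]:
    """Illustrative only: split markdown into {'title', 'body'} chunks at '#' headings.

    Assumptions (not taken from the library): sizes are counted in characters,
    every line starting with '#' is a heading (even inside a code fence),
    and no meta block is parsed.
    """
    chunks: List[Dict[str, str]] = []
    title = ""
    body_lines: List[str] = []

    def flush() -> None:
        # Emit the accumulated body under the current title, skipping empty bodies.
        body = "\n".join(body_lines).strip()
        if body:
            chunks.append({"title": title, "body": body})
        body_lines.clear()

    for line in content.splitlines():
        if line.startswith("#"):
            # A heading closes the previous chunk and sets the next title.
            flush()
            title = line.lstrip("#").strip()
        else:
            # Start a new chunk under the same title if adding this line
            # would push the body past the size cap.
            current = sum(len(l) + 1 for l in body_lines)
            if body_lines and current + len(line) > max_chunk_size:
                flush()
            body_lines.append(line)
    flush()
    return chunks


sample = "# Intro\nHello world.\n\n## Setup\nInstall it.\nRun it."
chunks = split_by_headings_sketch(sample)
for chunk in chunks:
    print(chunk["title"], "->", repr(chunk["body"]))
```

Each returned dictionary has the 'title' and 'body' keys described under Returns; the real function may build titles and handle oversized sections differently.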