pipeline/tools/partner_pkg_table.py (1 addition, 1 deletion)

@@ -191,7 +191,7 @@ def doc() -> str:
 {{/* File generated automatically by pipeline/tools/partner_pkg_table.py */}}
 {{/* Do not manually edit */}}

-LangChain Python offers an extensive ecosystem with 1000+ integrations across chat & embedding models, tools & toolkits, document loaders, vector stores, and more.
+LangChain offers an extensive ecosystem with 1000+ integrations across chat & embedding models, tools & toolkits, document loaders, vector stores, and more.
src/oss/integrations/splitters/index.mdx (4 additions, 4 deletions)
@@ -7,14 +7,14 @@ title: "Text splitters"
 There are several strategies for splitting documents, each with its own advantages.

 <Tip>
-For most use cases, start with the [RecursiveCharacterTextSplitter](/oss/integrations/splitters/recursive_text_splitter). It provides a solid balance between keeping context intact and managing chunk size. This default strategy works well out of the box, and you should only consider adjusting it if you need to fine-tune performance for your specific application.
+For most use cases, start with the [`RecursiveCharacterTextSplitter`](/oss/integrations/splitters/recursive_text_splitter). It provides a solid balance between keeping context intact and managing chunk size. This default strategy works well out of the box, and you should only consider adjusting it if you need to fine-tune performance for your specific application.
 </Tip>

 ## Text structure-based

 Text is naturally organized into hierarchical units such as paragraphs, sentences, and words. We can leverage this inherent structure to inform our splitting strategy, creating splits that maintain natural language flow, preserve semantic coherence within each split, and adapt to varying levels of text granularity. LangChain's `RecursiveCharacterTextSplitter` implements this concept:

-- The [RecursiveCharacterTextSplitter](/oss/integrations/splitters/recursive_text_splitter) attempts to keep larger units (e.g., paragraphs) intact.
+- The [`RecursiveCharacterTextSplitter`](/oss/integrations/splitters/recursive_text_splitter) attempts to keep larger units (e.g., paragraphs) intact.
 - If a unit exceeds the chunk size, it moves to the next level (e.g., sentences).
 - This process continues down to the word level if necessary.
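The fallback sequence described in the bullets above can be sketched in a few lines of plain Python. This is a hedged illustration of the idea only, not LangChain's actual implementation: the real `RecursiveCharacterTextSplitter` also merges small pieces back together up to `chunk_size` and supports chunk overlap, which this toy omits.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Toy sketch of recursive splitting: keep large units intact,
    and fall back to finer separators only when a piece is too big."""
    if len(text) <= chunk_size:
        return [text]
    if not separators:
        # Last resort: hard cut at the character level.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= chunk_size:
            chunks.append(piece)
        else:
            chunks.extend(recursive_split(piece, chunk_size, rest))
    return [c for c in chunks if c]
```

Paragraphs that fit the budget survive intact, while oversized ones degrade gracefully to sentence- and then word-level cuts.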
@@ -53,7 +53,7 @@ Types of length-based splitting:
 - Token-based: Splits text based on the number of tokens, which is useful when working with language models.
 - Character-based: Splits text based on the number of characters, which can be more consistent across different types of text.
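The token-based variant reduces to a short sketch. Whitespace-separated words stand in for real tokenizer tokens here, which is an illustrative assumption; production code would count tokens with the model's actual tokenizer.

```python
def split_by_token_count(text, max_tokens):
    """Toy token-based splitting. Whitespace words stand in for
    real tokenizer tokens (an illustrative simplification)."""
    tokens = text.split()
    return [" ".join(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), max_tokens)]
```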
-Example implementation using LangChain's CharacterTextSplitter with token-based splitting:
+Example implementation using LangChain's `CharacterTextSplitter` with token-based splitting:

 :::python
 ```python
@@ -89,7 +89,7 @@ Some documents have an inherent structure, such as HTML, Markdown, or JSON files
 :::python
 Examples of structure-based splitting:

-- Markdown: Split based on headers (e.g., #, ##, ###)
+- Markdown: Split based on headers (e.g., `#`, `##`, `###`)
 - HTML: Split using tags
 - JSON: Split by object or array elements
 - Code: Split by functions, classes, or logical blocks
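The Markdown case in the list above can be illustrated with a minimal sketch. This is not LangChain's `MarkdownHeaderTextSplitter` itself, just the core idea of starting a new section at each `#`/`##`/`###` heading.

```python
import re

def split_by_headers(md_text):
    """Toy header-based Markdown splitting: start a new section at
    each #, ##, or ### heading and keep its body lines together."""
    sections, current = [], {"header": None, "lines": []}
    for line in md_text.splitlines():
        match = re.match(r"^(#{1,3})\s+(.*)$", line)
        if match:
            # Flush the previous section before opening a new one.
            if current["header"] is not None or current["lines"]:
                sections.append(current)
            current = {"header": match.group(2), "lines": []}
        else:
            current["lines"].append(line)
    sections.append(current)
    return sections
```

The real splitter additionally attaches the header hierarchy as metadata on each chunk, which this sketch leaves out.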
 |[`DirectoryLoader`](/oss/integrations/document_loaders/file_loaders/directory)| Load all files from a directory with custom loader mappings | Package |
+|[`UnstructuredLoader`](/oss/integrations/document_loaders/file_loaders/unstructured)| Load multiple file types using Unstructured API | API |
+|[`MultiFileLoader`](/oss/integrations/document_loaders/file_loaders/multi_file)| Load data from multiple individual file paths | Package |

-|[Sitemap](/oss/integrations/document_loaders/web_loaders/sitemap)| Load all pages from a sitemap.xml | ✅ | Package |
-|[Browserbase](/oss/integrations/document_loaders/web_loaders/browserbase)| Load webpages using managed headless browsers with stealth mode | ✅ | API |
-|[WebPDFLoader](/oss/integrations/document_loaders/web_loaders/pdf)| Load PDF files in web environments | ✅ | Package |
+|[`Cheerio`](/oss/integrations/document_loaders/web_loaders/web_cheerio)| Load webpages using Cheerio (lightweight, no JavaScript execution) | ✅ | Package |
+|[`Sitemap`](/oss/integrations/document_loaders/web_loaders/sitemap)| Load all pages from a sitemap.xml | ✅ | Package |
+|[`Browserbase`](/oss/integrations/document_loaders/web_loaders/browserbase)| Load webpages using managed headless browsers with stealth mode | ✅ | API |
+|[`WebPDFLoader`](/oss/integrations/document_loaders/web_loaders/pdf)| Load PDF files in web environments | ✅ | Package |
src/oss/langchain/models.mdx (2 additions, 2 deletions)
@@ -1257,7 +1257,7 @@ Model profile data allow applications to work around model capabilities dynamically

 Model profile data can be updated through the following process:

-1. (If needed) update the source data at [models.dev](https://models.dev/) through a pull request to its [repository on Github](https://github.com/sst/models.dev).
+1. (If needed) update the source data at [models.dev](https://models.dev/) through a pull request to its [repository on GitHub](https://github.com/sst/models.dev).
 2. (If needed) update additional fields and overrides in `langchain_<package>/data/profile_augmentations.toml` through a pull request to the LangChain [integration package](/oss/integrations/providers/overview).
 3. Use the [`langchain-model-profiles`](https://pypi.org/project/langchain-model-profiles/) CLI tool to pull the latest data from [models.dev](https://models.dev/), merge in the augmentations, and update the profile data:
@@ -1330,7 +1330,7 @@ Model profile data allow applications to work around model capabilities dynamically

 Model profile data can be updated through the following process:

-1. (If needed) update the source data at [models.dev](https://models.dev/) through a pull request to its [repository on Github](https://github.com/sst/models.dev).
+1. (If needed) update the source data at [models.dev](https://models.dev/) through a pull request to its [repository on GitHub](https://github.com/sst/models.dev).
 2. (If needed) update additional fields and overrides in `langchain-<package>/profiles.toml` through a pull request to the LangChain [integration package](/oss/integrations/providers/overview).
src/oss/python/integrations/callbacks/confident.mdx (1 addition, 1 deletion)
@@ -151,4 +151,4 @@ You can create your own custom metrics [here](https://docs.confident-ai.com/docs
 DeepEval also offers other features, such as the ability to [automatically create unit tests](https://docs.confident-ai.com/docs/quickstart/synthetic-data-creation) and [test for hallucination](https://docs.confident-ai.com/docs/measuring_llm_performance/factual_consistency).

-If you are interested, check out our Github repository here [https://github.com/confident-ai/deepeval](https://github.com/confident-ai/deepeval). We welcome any PRs and discussions on how to improve LLM performance.
+If you are interested, check out our GitHub repository here [https://github.com/confident-ai/deepeval](https://github.com/confident-ai/deepeval). We welcome any PRs and discussions on how to improve LLM performance.