docs/source/redact/redacting_json.rst
+24 -12 (24 additions & 12 deletions)
@@ -1,15 +1,21 @@
 Redact JSON data
 ===================
-To redact sensitive information from a JSON string or Python dict, pass the object to the `redact_json` method:

-Like other SDK functions that modify data the `redact_html` allows you to configure how different entity types are treated. You can learn more about the common parameters:
+Using redact_json
+-----------------

-* generator_default
-* generator_config
-* label_allow_lists
-* label_block_lists
+To redact sensitive information from a JSON string or Python dict, pass the object to the ``redact_json`` method:

-by reading :ref:`redact-config`.
+Similar to other SDK functions that modify data, ``redact_html`` allows you to configure how to treat different entity types.
+
+To learn more about the common parameters:
+
+* ``generator_default``
+* ``generator_config``
+* ``label_allow_lists``
+* ``label_block_lists``
+
+go to :ref:`redact-config`.

 .. code-block:: python

@@ -48,14 +54,20 @@ This produces the following output:
 Conversation data stored in JSON
 --------------------------------

-When conversation data (typically text transcribed from audio recordings) is stored in JSON it is common for different parts of the conversation are found spread across multiple locations in JSON. Using the redact_json method is not ideal because each piece of text is treated independently when performing NER identification. This can result in worse NER identification. The :class:`JsonConversationHelper<tonic_textual.helpers.json_conversation_helper.JsonConversationHelper>` will process entire conversations in single NER calls yielding better performance and then return an NER result that still maps to your original JSON structure.
+When conversation data, such as text transcribed from audio recordings, is stored in JSON, different parts of the conversation are often spread across multiple locations in the JSON.

-As an example, let's say you have a JSON document representing a conversation as follows:
+Using the ``redact_json`` method is not ideal in this case, because NER identification treats each piece of text independently. This can result in worse NER identification.
+
+The :class:`JsonConversationHelper<tonic_textual.helpers.json_conversation_helper.JsonConversationHelper>` processes entire conversations in single NER calls, which improves performance, and then returns an NER result that still maps to your original JSON structure.
+
+For example, the following JSON document represents a conversation:
-Naively, we could process each speech utterance using our redact_json endpoint but we could lose context since each utterance would be run through our models independetly. Let's use the :class:`JsonConversationHelper<tonic_textual.helpers.json_conversation_helper.JsonConversationHelper>` to improve our results.
+Naively, we could use the ``redact_json`` endpoint to process each speech utterance. However, we might lose context, because each utterance runs through our models independently.
+
+To improve the results, we'll use the :class:`JsonConversationHelper<tonic_textual.helpers.json_conversation_helper.JsonConversationHelper>`.

 .. code-block:: python

@@ -79,7 +91,7 @@ Naively, we could process each speech utterance using our redact_json endpoint b
-This yields the following redaction result below. Each piece of speech from the conversation is stored in its own element in the resulting array. The order of text in the response matches the order of text in the original conversation.
+This produces the following redaction result. In the resulting array, each piece of speech from the conversation is stored in its own element. The order of the text in the response matches the order of text in the original conversation.
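
Editorial illustration (not part of the diff): the idea described in this change, one NER pass over the whole conversation with results mapped back to each utterance, can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the ``JsonConversationHelper`` implementation, and it does not call the Textual API. The names ``toy_ner`` and ``detect_per_utterance``, the span dictionary format, and the sample conversation are all hypothetical and invented for this example.

```python
import re
from typing import Any, Callable, Dict, List

Span = Dict[str, Any]  # hypothetical span format: {"start", "end", "label"}


def toy_ner(text: str) -> List[Span]:
    """Hypothetical stand-in for a real NER call: tags two hard-coded names."""
    return [
        {"start": m.start(), "end": m.end(), "label": "NAME"}
        for m in re.finditer(r"\b(?:Alice|Bob)\b", text)
    ]


def detect_per_utterance(
    utterances: List[str],
    ner: Callable[[str], List[Span]],
    sep: str = "\n",
) -> List[List[Span]]:
    """Run `ner` once over the joined conversation, then map spans back."""
    # Record where each utterance starts and ends inside the joined text.
    bounds = []
    offset = 0
    for text in utterances:
        bounds.append((offset, offset + len(text)))
        offset += len(text) + len(sep)

    # A single NER call sees the whole conversation as context.
    spans = ner(sep.join(utterances))

    # Re-express each span relative to the utterance that contains it.
    # Spans that straddle a separator are dropped in this sketch.
    results: List[List[Span]] = [[] for _ in utterances]
    for span in spans:
        for i, (lo, hi) in enumerate(bounds):
            if lo <= span["start"] and span["end"] <= hi:
                results[i].append(
                    {
                        "start": span["start"] - lo,
                        "end": span["end"] - lo,
                        "label": span["label"],
                    }
                )
                break
    return results


conversation = ["Hi, this is Alice.", "Hello Alice, Bob speaking."]
per_utterance = detect_per_utterance(conversation, toy_ner)
```

Because the returned offsets are relative to each utterance, slicing ``conversation[i]`` with a span's ``start`` and ``end`` recovers the detected text, and the result list keeps the same order as the input conversation.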