@@ -10,16 +10,204 @@ Data Serialization
1010What is data serialization?
1111***************************
1212
13- Data serialization is the concept of converting structured data into a format
13+ Data serialization is the process of converting structured data into a format
1414that allows it to be shared or stored in such a way that its original
15- structure to be recovered. In some cases, the secondary intention of data
15+ structure can be recovered or reconstructed. In some cases, the secondary intention of data
1616serialization is to minimize the size of the serialized data which then
1717minimizes disk space or bandwidth requirements.
1818
19+ ********************
20+ Flat vs. Nested data
21+ ********************
1922
20- ******
21- Pickle
22- ******
23+ Before beginning to serialize data, it is important to identify or decide how the
24+ data should be structured during data serialization: flat or nested.
25+ The differences between the two styles are shown in the examples below.
26+
27+ Flat style:
28+
29+ .. code-block:: python
30+
31+     {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}
32+
33+
34+ Nested style:
35+
36+ .. code-block:: python
37+
38+     {"A":
39+         {"field1": "value1", "field2": "value2", "field3": "value3"}}
40+
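
The relationship between the two styles can be sketched in a few lines. This is
only an illustration under the assumption that a flat record carries its type
under a ``Type`` key, as in the examples above; the helper names
``flat_to_nested`` and ``nested_to_flat`` are hypothetical and not part of any
library.

```python
# Hypothetical helpers (not from any library) converting between the
# flat and nested styles shown above.

def flat_to_nested(record, type_key="Type"):
    """Lift the value stored under type_key so it becomes the outer key."""
    fields = {k: v for k, v in record.items() if k != type_key}
    return {record[type_key]: fields}


def nested_to_flat(record, type_key="Type"):
    """Inverse of flat_to_nested for a nested record with exactly one outer key."""
    (type_value, fields), = record.items()
    return {type_key: type_value, **fields}


flat = {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}
nested = flat_to_nested(flat)

print(nested)
print(nested_to_flat(nested) == flat)
```

Which direction is more convenient probably depends on the target format:
tabular formats such as CSV tend to suit the flat style, while JSON, YAML and
XML can hold either.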
41+
42+ For more reading on the two styles, please see the discussions on the
43+ `Python mailing list <https://mail.python.org/pipermail/python-list/2010-October/590762.html>`__,
44+ `IETF mailing list <https://www.ietf.org/mail-archive/web/json/current/msg03739.html>`__ and
45+ `Stack Exchange <https://softwareengineering.stackexchange.com/questions/350623/flat-or-nested-json-for-hierarchal-data>`__.
46+
47+ ****************
48+ Serializing Text
49+ ****************
50+
51+ =======================
52+ Simple file (flat data)
53+ =======================
54+
55+ If the data to be serialized is located in a file and contains flat data, Python offers two built-in tools: repr to serialize it and ast.literal_eval to read it back.
56+
57+ repr
58+ ----
59+
60+ The built-in repr function takes a single object and returns a printable string representation of it:
61+
62+ .. code-block:: python
63+
64+     # input as flat data
65+     a = {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}
66+
67+     # returns a printable representation of the input;
68+     # the output can be written to a file as well
69+     print(repr(a))
70+
71+     # write content to a file using repr
72+     with open('/tmp/file.py', 'w') as f: f.write(repr(a))
73+
74+     # the same text can be read back from the file
75+     with open('/tmp/file.py', 'r') as f: print(f.read())
76+
77+
78+ ast.literal_eval
79+ ----------------
80+
81+ The literal_eval function in the ast module safely parses and evaluates a string containing a Python literal. Supported data types are: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
82+
83+ .. code-block:: python
84+
85+     import ast
86+     with open('/tmp/file.py', 'r') as f: inp = ast.literal_eval(f.read())
87+
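The two functions are naturally used as a pair. The following is a minimal
round-trip sketch; the temporary file location and the variable names are
assumptions made for illustration only.

```python
import ast
import os
import tempfile

a = {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}

# Use a throwaway directory (an assumption of this sketch) instead of a fixed path.
path = os.path.join(tempfile.mkdtemp(), "file.py")

# Serialize with repr ...
with open(path, "w") as f:
    f.write(repr(a))

# ... and recover the original structure with ast.literal_eval.
with open(path, "r") as f:
    restored = ast.literal_eval(f.read())

print(restored == a)

# literal_eval only accepts literals; a function call is rejected.
rejected = False
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
print(rejected)
```

Because ``literal_eval`` refuses anything that is not a plain literal, it is
generally a safer choice than ``eval`` for reading data back.
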
88+ ====================
89+ CSV file (flat data)
90+ ====================
91+
92+ The csv module in the Python standard library implements classes to read and write tabular
93+ data in CSV format.
94+
95+ Simple example for reading:
96+
97+ .. code-block:: python
98+
99+     # Reading CSV content from a file
100+     import csv
101+     with open('/tmp/file.csv', newline='') as f:
102+         reader = csv.reader(f)
103+         for row in reader:
104+             print(row)
105+
106+ Simple example for writing:
107+
108+ .. code-block:: python
109+
110+     # Writing CSV content to a file; iterable is any iterable of rows
111+     import csv
112+     with open('/tmp/file.csv', 'w', newline='') as f:
113+         writer = csv.writer(f)
114+         writer.writerows(iterable)
115+
116+
117+ The module's contents, functions, and examples can be found
118+ `in the Python documentation <https://docs.python.org/3/library/csv.html>`__.
119+
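The reading and writing examples can be combined into a round trip. This is a
sketch only; the sample rows and the temporary file location are made up for
illustration.

```python
import csv
import os
import tempfile

# Sample rows (hypothetical data): a header followed by two records.
rows = [
    ["Type", "field1", "field2"],
    ["A", "value1", "value2"],
    ["B", "value3", "value4"],
]

path = os.path.join(tempfile.mkdtemp(), "file.csv")

# newline='' lets the csv module control line endings itself.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

with open(path, newline="") as f:
    read_back = list(csv.reader(f))

print(read_back == rows)
```

Note that ``csv.reader`` returns every field as a string by default, so numeric
values need to be converted explicitly after reading.
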
120+ ==================
121+ YAML (nested data)
122+ ==================
123+
124+ There are many third-party modules to parse and read/write YAML file
125+ structures in Python. One such example, using PyYAML, is below.
126+
127+ .. code-block:: python
128+
129+     # Reading YAML content from a file using the safe_load method
130+     import yaml
131+     with open('/tmp/file.yaml', 'r') as f:
132+         try:
133+             print(yaml.safe_load(f))
134+         except yaml.YAMLError as ymlexcp:
135+             print(ymlexcp)
136+
137+ Documentation on the third-party module can be found
138+ `in the PyYAML Documentation <https://pyyaml.org/wiki/PyYAMLDocumentation>`__.
139+
140+ =======================
141+ JSON file (nested data)
142+ =======================
143+
144+ Python's json module can be used to read and write JSON files.
145+ Example code is below.
146+
147+ Reading:
148+
149+ .. code-block:: python
150+
151+     # Reading JSON content from a file
152+     import json
153+     with open('/tmp/file.json', 'r') as f:
154+         data = json.load(f)
155+
156+ Writing:
157+
158+ .. code-block:: python
159+
160+     # Writing JSON content to a file using the dump method
161+     import json
162+     with open('/tmp/file.json', 'w') as f:
163+         json.dump(data, f, sort_keys=True)
164+
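The same round trip can be done in memory with ``json.dumps`` and
``json.loads``. The sketch below reuses the nested example from earlier in this
guide; the second part illustrates that not every Python structure survives
unchanged.

```python
import json

data = {"A": {"field1": "value1", "field2": "value2", "field3": "value3"}}

# Serialize to a string, then parse it back.
text = json.dumps(data, sort_keys=True, indent=2)
restored = json.loads(text)
print(restored == data)

# JSON has no tuples and only string keys, so some types change on the way.
lossy = json.loads(json.dumps({1: (1, 2)}))
print(lossy)
```

Nested dictionaries and lists of strings, numbers, booleans and ``None`` come
back as they went in, whereas tuples are returned as lists and non-string keys
as strings.
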
165+ =================
166+ XML (nested data)
167+ =================
168+
169+ XML parsing in Python is possible using the `xml` package.
170+
171+ Example:
172+
173+ .. code-block:: python
174+
175+     # reading XML content from a file
176+     import xml.etree.ElementTree as ET
177+     tree = ET.parse('country_data.xml')
178+     root = tree.getroot()
179+
180+ More documentation on using the `xml.dom` and `xml.sax` packages can be found
181+ `in the Python XML library documentation <https://docs.python.org/3/library/xml.html>`__.
182+
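For a self-contained illustration, ``ElementTree`` can also parse XML from a
string and serialize an element tree back to text. The XML snippet and element
names below are made up for this sketch and are not the contents of
``country_data.xml``.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document used only for this sketch.
xml_text = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
</data>"""

# Parse from a string and walk the nested structure.
root = ET.fromstring(xml_text)
ranks = {c.get("name"): int(c.find("rank").text) for c in root.findall("country")}
print(ranks)

# Build a small tree and serialize it back to XML text.
new_root = ET.Element("A")
for name, value in (("field1", "value1"), ("field2", "value2")):
    ET.SubElement(new_root, name).text = value
serialized = ET.tostring(new_root, encoding="unicode")
print(serialized)
```

Element text is always a string, so values such as the rank have to be
converted explicitly.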
183+
184+ *******
185+ Binary
186+ *******
187+
188+ =======================
189+ NumPy Array (flat data)
190+ =======================
191+
192+ NumPy arrays can be serialized to, and deserialized from, a byte representation.
193+
194+ Example:
195+
196+ .. code-block:: python
197+
198+     import numpy as np
199+
200+     # Converting NumPy array to byte format
201+     byte_output = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int64).tobytes()
202+
203+     # Converting byte format back to NumPy array (dtype and shape must be supplied again)
204+     array_format = np.frombuffer(byte_output, dtype=np.int64).reshape(3, 3)
205+
206+
207+
208+ ====================
209+ Pickle (nested data)
210+ ====================
23211
24212The native data serialization module for Python is called `Pickle
25213<https://docs.python.org/2/library/pickle.html>`_.