docs/data-operations-manual/Tutorials/Monitoring-Data-Quality.md
39 lines changed: 39 additions & 0 deletions
@@ -214,3 +214,42 @@ Sometimes it may take a long time for data to be transitioned to a newly created
5. The entity-organisation range should be assigned to the new organisation for any of the entity numbers which are now being used for the new organisation's records.

6. [Retire any endpoints](../../How-To-Guides/Retiring/Retire-endpoints.md) for the old organisation's provisions so they are no longer collected.

## De-duplication of conservation-area data

This process ensures that duplicate data is not stored unnecessarily when an organisation provides conservation-area data that may also have been provided by Historic England (HE).

The steps required for this process are:

1. Run the add-data tasks for the conservation-area dataset, making a note of how many entities were added to the lookup file.

2. Raise the pull request (PR) and ensure that it has been merged into the main branch so that the duplicate entities are picked up by the expectation report on the following day.

3. `DO NOT` inform the organisation at this stage.

4. In Power BI, navigate to the "Digital Planning" workspace, then to the "Planning Data Monitoring" report, and select the "Duplicate Conservation Area" page. (Link_[0])

5. Click on the report's title so that the options panel appears on the right-hand side.

6. Click on the three dots to open the "More options" dropdown menu, then select "Export data" to download the output.

7. Open the exported file to show the HE duplicate entities.

8. Filter the message column on the "complete_match" criterion.

9. Filter the entity_a_organisation.name column on Historic England, and filter the entity_b_organisation.name column on the organisation whose data was added on the previous day (see step 1).

10. Copy the entity numbers from the entity_a and entity_b columns.

11. Prepare the data to be appended to old-entity.csv, located at Link_[1], in the following format, where entity_a = old-entity and entity_b = entity:

    e.g. `44012512,301,44013703,,redirect Historic England duplicate to LPA entity,2025-08-28,`

12. `Also DO NOT forget` to update the entity-organisation file located at Link_[2].

13. When this change is merged, check the Power BI report to confirm the duplicate entities have been fixed.
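
Steps 8 to 11 can also be done with a short script rather than by hand in a spreadsheet. The following is a minimal sketch, not an established tool. It assumes the Power BI output was exported as CSV and uses the column names given in steps 8 to 10. The seven-field row layout is copied from the example in step 11 rather than from a documented schema. The function names, the sample organisation "Example Council" and the sample message value "single_match" are hypothetical.

```python
import csv
import io

HE_NAME = "Historic England"
NOTE = "redirect Historic England duplicate to LPA entity"


def build_redirect_rows(export_rows, organisation_name, date):
    """Return old-entity.csv rows for complete HE/organisation matches.

    export_rows: iterable of dicts keyed by the export's column names.
    organisation_name: the organisation whose data was added in step 1.
    date: ISO date string placed in the sixth field, as in the step 11 example.
    """
    rows = []
    for row in export_rows:
        # Step 8: keep complete matches only.
        if row["message"] != "complete_match":
            continue
        # Step 9: entity_a must be HE, entity_b the organisation from step 1.
        if row["entity_a_organisation.name"] != HE_NAME:
            continue
        if row["entity_b_organisation.name"] != organisation_name:
            continue
        # Step 11: entity_a is the old entity, entity_b is the entity kept.
        # The field order is an assumption based on the example row.
        rows.append([row["entity_a"], "301", row["entity_b"], "", NOTE, date, ""])
    return rows


def to_csv_text(rows):
    """Serialise the rows as CSV text ready to append to old-entity.csv."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample standing in for the exported file.
sample = io.StringIO(
    "message,entity_a,entity_a_organisation.name,entity_b,entity_b_organisation.name\n"
    "complete_match,44012512,Historic England,44013703,Example Council\n"
    "single_match,44012513,Historic England,44013704,Example Council\n"
    "complete_match,44012514,Historic England,44099999,Other Council\n"
)
rows = build_redirect_rows(csv.DictReader(sample), "Example Council", "2025-08-28")
print(to_csv_text(rows), end="")
```

Only the first sample row passes all three filters. Comparing the number of rows produced with the entity count noted in step 1 is a quick sanity check before appending them to the file.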