Discard Microsoft.Data.SqlCLient.SqlMetaData.xml in favor of storing metadata internally#4362
Discard Microsoft.Data.SqlCLient.SqlMetaData.xml in favor of storing metadata internally#4362tetolv wants to merge 8 commits into
Conversation
…d of Microsoft.Data.SqlClient.SqlMetaData.xml
edwardneal
left a comment
There was a problem hiding this comment.
Thanks for this. I've made a first-pass review for discussion.
| "DUMMY" | ||
| ]), | ||
| new SqlCommandCollection("Users", 1, 1, null, null, "select uid, name as user_name, createdate, updatedate from sysusers where (name = @Name or (@Name is null))", | ||
| [new Restriction(1, "User_Name", "@Name")]), |
There was a problem hiding this comment.
Do we need to specify the sequence number here? I can't see a reason why we'd ever want to have duplicates.
There was a problem hiding this comment.
I'm not sure, I understand your question. The reason I left explicit restriction numbers there, is that according to documentation https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/common-schema-collections#restrictions , column RestrictionNumber is required and for compatibility with the original ordering of the restrictions, I made those numbers explicit. Of course they could be removed and resulting table would rely on the implicit restrictions order in array for each sql-based collection.
| private readonly string _populationString; | ||
| public readonly Restriction[] RestrictionParams; | ||
|
|
||
| public SqlCommandCollection(string collectionName, int numberOfRestrictions, int numberOfIdentifierParts, |
There was a problem hiding this comment.
If there are no circumstances where numberOfRestrictions is not restrictions.Length then we don't need to specify it explicitly.
There was a problem hiding this comment.
It's just that there are two brotherly columns NumberOfRestrictions and NumberOfIdentifierParts, which are explicitly required by documentation https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/common-schema-collections#metadatacollections and they have a direct dependency: NumberOfIdentifierParts could not be greater than NumberOfRestrictions. If we would separate them, by replacing explicit definition of NumberOfRestrictions with a length of restriction array, then this dependency would not be so obvious for anyone adding new query-based collections.
Also I wanted for metadata definition in code, to resemble as close as possible, metadata returned to the caller.
| // 7. Variable-length string and binary types: varchar, nvarchar, varbinary. | ||
| // 8. Miscellaneous fixed-length types: uniqueidentifier, sql_variant, timestamp | ||
|
|
||
| AddFixedNumericType(SqlDbType.TinyInt, isBestMatch: true); |
There was a problem hiding this comment.
We've lost a lot of metadata by replacing these with a single list; the goal of doing this was to make it difficult for DataTypes to drift from SqlClient's internal MetaType definitions.
Can we replace the list in SqlMetaDataFactory.cs with multiple constructors, and derive data from MetaType definitions in the same way?
There was a problem hiding this comment.
I have thought about unification with MetaData definition, and even initially had a second constructor for SqlMetaDataFactory.TypeMetaData which would use them, but very few fields would match directly one-to-one between the two definitions for each type, so I decided to go on with a flat table, where all type metadata are defined explicitly. I bear in mind, that in #3372 there was an additional proposal to make general unification of all datatype info in one place, but right now I thought it to be a task for later.
Right now I have settled on the intermediate solution, I initialize all types in the private static readonly TypeMetaData[] s_types with a constructor, that takes SqlDbType and derive from it all the type values, that have a direct one-to-one relation to database schema datatype values,
public TypeMetaData(SqlDbType dbType, ...)
{
// Shared type properties
MetaType metaType = MetaType.GetMetaTypeFromSqlDbType(dbType, isMultiValued: false);
TypeName = alias ?? metaType.TypeName;
ProviderDbType = (int)metaType.SqlDbType;
DataType = metaType.ClassType.FullName!;
// Schema specific type properties
..., and the rest are defined explicitly in s_types as before.
The reason I haven't taken many-constructor approach is, because there were constructors, used by several datatypes, but inside a constructor there were conditional logic based on particular datatype being constructed. Not only this logic was hard to follow to create a complete mental picture of what properties of constructed datatype would be, but also adding a new data type would anyway require either adding a new dedicated constructor, or additionally enhance inner logic of one of the existing constructors.
When datatype metadata unification will eventually be made, then I expect that datatype class will have all the fields required by all datatype representations that we have, and optionally perhaps a set methods: ToMetaType, ToSchemaRow, etc..., to produce particular type representations where needed.
There was a problem hiding this comment.
I have additionally updated my version, so now almost everything about data types is derived from Microsoft.Data.SqlClient.MetaType, except for MaximumScale property for date, datetime2, datetimeoffset.
| @@ -237,14 +230,8 @@ | |||
| <EmbeddedResource Include="Resources\ILLink.Substitutions.xml" | |||
There was a problem hiding this comment.
The file also needs to be deleted.
There was a problem hiding this comment.
Do you really mean ILLink.Substitutions.xml? This file is not related to my changes is any way. Or you mean Microsoft.Data.SqlClient.SqlMetaData.xml should be physically removed from repo, not just it's inclusion in project file.
…tions for all query versions
…f certain collection, or type is supported, because Azure SQL always returns version 12.0.XXXX.
|
I have several more questions:
|
| } | ||
| }; | ||
|
|
||
| DataRow row = table.NewRow(); |
There was a problem hiding this comment.
👍 Somehow Visual Studio haven't highlighted that unused code to me, so I haven't noticed.
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
Description
Continuation of @paulmedynski effort on #3372 to pull inside one by one all metadata collections from Microsoft.Data.SqlCLient.SqlMetaData.xml. Here it led me to complete rewrite of the
SqlMetaDataFactory. Main reasons were:DataTableas cached representation of metadata, with dedicated classes, because they require less checks for data consistency, they should consume less memory and we don't requireDataTablequerying capabilities anyways_metaDataCollection, which in most cases should be the only place, where new metadata additions and tweaks should be performed, without need to dig deep into the code, which handles this data structure. There I have diverted from the code style guidelines to use longer lines, because that structure by nature has rather tabular representation and it would never fit into 100 characters, but I think that maintaining it's tabular representation is more important to code readability.Fixes
sys.table_types, used in query for StructuredTypeMembers metadata collection was supported since SQL2008(v10.0), not since SQL2005(v9.0). I have updated minimumVersion accordingly.Localization
MDF_DataTableDoesNotExistneed to be rephrased and translated. Original text was The collection '{0}' is missing from the metadata XML., but obviously it is not valid anymore.Testing
I have compared metadata returned from rewritten version of
SqlMetaDataFactorywith metadata returned from MDS v7.0.1. I have attached comparison results inSchemaComparison.zip.SchemaComparison.zip It has the following structure:
--- config: treeView: rowIndent: 10 lineThickness: 1 themeVariables: treeView: labelFontSize: '15px' labelColor: '#808080' lineColor: '#606060' --- treeView-beta "09.00.5000 (MSSQL 2005)" "legacy" "multiple .csv files for each collection." "new" "multiple .csv files for each collection" "12.00.9114 (Azure database)" "legacy" "multiple .csv files for each collection" "new" "multiple .csv files for each collection" "17.00.4015 (MSSQL 2025)" "legacy" "multiple .csv files for each collection" "new" "multiple .csv files for each collection"Here's a list of all notable differences:
ReservedWordscollection in all versions (09.00.5000, 12.00.9114, 17.00.4015) all elements are rearranged, but no new items appears, or were removed. Rearrangement is the same as in #PR4328, because I took a list of reserved words from there.Restrictionscollection in all versions, there exists a columnRestrictionDefault. But in https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/common-schema-collections it is marked as Ignored. Previously for all the metadata collections, which were based on query to server system views, this column contained name of the column in system view, which was used by the restriction, but it was not always the case. For now I have decided to do not return anything inRestrictionDefaultcolumn, but if it's deemed important, then I can return there the same values as before.Restrictionscollection, filtering by server engine version wasn't working correctly, so on v09.00.5000 restrictions for collectionsAllColumns,ColumnSetColumns,StructuredTypeMemberswere returned, although collections itself are not supported on that engine version.MetaDataCollectionsthere are two new collectionsTVPsandUDTs. Those are used by DataTypes collection, where in addition to static data types, there are also returned table-valued parameter types, and user-defined types, that are read from the server system views. Previously there were separate methods for querying each of them and appending to the DataTypes collection, but now they are handled by separateSqlCommandCollectioninstances added to thes_metaDataCollection. As result are also returned inMetaDataCollections. Those two could either remain as undocumented collections, or I can somehow mark them as internal, so they wouldn't be returned in MetaDataCollections.No automatic tests were added, because affected functionality is completely private.
Will be happy to hear any feedback.