With a selection of schemas tested and a set of CodeDom tests established for the checks we're most likely to need in part 2, let's turn our attention to the tweaks we want to make to the generated CodeNamespace members. Here's my initial list. We could well add to it later on, provided more tests are written to cover the additional code.

  1. Change the field \ property names from elementField\Element to element\Element
  2. Change the type of a field \ property
  3. Change the name of a type 
  4. Change the way dates are serialized.
  5. Change the way boolean values are serialized.

In this post, we'll build the code that makes the first of these changes.

It's a bit odd that Microsoft have chosen to autogenerate these with names which don't satisfy the usual naming conventions. For example, for a schema element called Name, you would expect a property called Name backed up with a field called name or _name rather than nameField. Hey ho, that's one of the reasons we are writing this app - to generate the code we're comfortable using. So let's look at the typical Property\Field combo generated by xsd.exe

private string stringElementField;
public string StringElement {
get {
return this.stringElementField;
}
set {
this.stringElementField = value;
}
}

Let's start with the basics and write a couple of methods and tests to change a field's name. One assumes the xsd generated style name and the other not.

[Test]
public void ChangeFieldName_ExpectedFieldName_LowerCasesFirstLetterAndRemovesFieldSuffix()
{
string original = "StringElementField";
string newName = sc.ChangeFieldName(original);
Assert.AreEqual("stringElement", newName, 
"ChangeFieldName has not worked as expected");
}
[Test]
[ExpectedException(
typeof(Tools.XsdGenerator.UnexpectedFieldNameException), 
"The current field name is not the default xsd field name.")]
public void ChangeFieldName_UnexpectedFieldName_RaisesUnexpectedFieldNameException()
{
string original = "StringElement";
string newName = sc.ChangeFieldName(original);
}

We have to create a stub for ChangeFieldName() and the UnexpectedFieldNameException class before anything compiles, and then we have two failing tests. Excellent. Now to implement them. UnexpectedFieldNameException simply inherits all its functionality from FormatException, so I won't list it here.

public string ChangeFieldName(string original)
{
if (IsXsdGeneratedFieldName(original))
{
//Lower case the first letter of the name
string newName = FirstLetterToLowerCase(original);
//Remove Field (last five characters from the name)
newName = newName.Substring(0, newName.Length - 5);
return newName;
}
else
{
throw new UnexpectedFieldNameException(
"The current field name is not the default xsd field name.");
}
}
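
ChangeFieldName() leans on two helpers that aren't listed in this post. As a minimal sketch only, and assuming that "xsd-generated" simply means "ends in Field with something in front of it", they might look like this. The class name FieldNameHelpers is hypothetical, and the real implementations may well differ.

```csharp
using System;

// Hypothetical helper class; only the two method names come from the post.
public static class FieldNameHelpers
{
    private const string FieldSuffix = "Field";

    // Assumption: a name is treated as xsd-generated when it ends in
    // "Field" and has at least one character before that suffix.
    public static bool IsXsdGeneratedFieldName(string name)
    {
        return !string.IsNullOrEmpty(name)
            && name.Length > FieldSuffix.Length
            && name.EndsWith(FieldSuffix, StringComparison.Ordinal);
    }

    // Lower-cases only the first character and leaves the rest untouched.
    public static string FirstLetterToLowerCase(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }
        return char.ToLowerInvariant(value[0]) + value.Substring(1);
    }
}
```

With these in place, "StringElementField" passes the check and "StringElement" does not, which is what the two tests above rely on.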

Now what about the Property name? All our tests have established is that a schema element name is reflected in the property, including case. So if your schema reads

<?xml version="1.0" encoding="Windows-1252" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:element name="stringElement" type="xs:string"></xs:element>
<xs:element name="DateElement" type="xs:dateTime"></xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

You'll get field\property combos called stringElementField\stringElement and dateElementField\DateElement. We'll need to make sure to capitalise the first letter of the property's name just in case.

public string ChangePropertyName(string original)
{
return FirstLetterToUpperCase(original);
}

As ever, it's worth remembering here that the idea of TDD is to code only as much as I need to pass the tests I've written so far. If I want to rename fields or properties in a different way later on, I'll refactor or overload a method as is required.

As an aside, FirstLetterToLowerCase and FirstLetterToUpperCase would work well as string extension methods in C# 3.0. For example,

  • original.FirstLetterToUpperCase();
  • original.FirstLetterToLowerCase();

One for review if xsd.exe changes again with .NET BCL vNEXT.
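
For the curious, here is roughly what those extension methods could look like once C# 3.0 is available. This is a sketch under that assumption, not code from the project, and the class name StringCaseExtensions is made up for the example.

```csharp
using System;

// Requires C# 3.0 or later: extension methods live in a static class
// and mark their first parameter with 'this'.
public static class StringCaseExtensions
{
    public static string FirstLetterToUpperCase(this string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }
        return char.ToUpperInvariant(value[0]) + value.Substring(1);
    }

    public static string FirstLetterToLowerCase(this string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }
        return char.ToLowerInvariant(value[0]) + value.Substring(1);
    }
}
```

The calling code then reads as original.FirstLetterToUpperCase() rather than FirstLetterToUpperCase(original), which is the only real difference.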

OK, we can change the names. Now let's think about the consistency part of it. One property is linked to one field and that pair is located within a single type. So as long as we iterate only within a type we should be able to

  • find a property,
  • change the property's name
  • query the property's getter and setter for the field it refers to
  • store the original field name
  • change it in the get\set pair
  • find the field in the type that it refers to
  • change the field name.

Let's start writing a test. We'll need a type containing a field\property combination as if it had just come out of the ConvertSchemaToCodeNamespace() method. Fortunately, because we took our time to figure out what to test for while writing tests for that method (see the screenshots of the watch window in part two of this series), we know how to use System.CodeDom to construct the type\field\property construct we need. Once we've created the type, our test is only three further lines long thanks to the CodeDomAssert class we constructed previously.

[Test]
public void ChangeFieldPropertyNameCombo_SingleCombo_NamesGetsAndSetsChangedCorrectly()
{
//Create a field
CodeMemberField field = new CodeMemberField("System.String", 
DefaultGeneratedFieldName("myProperty"));
//Create a reference for a field
CodeFieldReferenceExpression fieldRef = 
new CodeFieldReferenceExpression
(new CodeThisReferenceExpression(), 
DefaultGeneratedFieldName("myProperty"));
//Create a simple return statement 'return field;'
CodeMethodReturnStatement returnStatement = 
new CodeMethodReturnStatement(fieldRef);
//Create a simple set statement 'field = value;'
CodeAssignStatement assignStatement = 
new CodeAssignStatement
(fieldRef, new CodePropertySetValueReferenceExpression());
//Create a property
CodeMemberProperty property = new CodeMemberProperty();
property.Name = "myProperty";
property.GetStatements.Add(returnStatement);
property.SetStatements.Add(assignStatement);
property.HasSet = true;
property.HasGet = true;
property.Type = new CodeTypeReference("System.String");
//Create a type
CodeTypeDeclaration type = 
new CodeTypeDeclaration(rootTypeName);
type.IsClass = true;
type.IsPartial = true;
type.Members.Add(property);
type.Members.Add(field);
//Now convert the property field combination
sc.ChangeFieldPropertyNameCombo(type, property);
//Test the field has been renamed correctly
CodeDomAssert.AssertFieldExists(
type.Members, MemberAttributes.Public, "System.String", "myProperty");
//Test the property has been renamed correctly and 
//the field has renamed correctly as well in the getter and setter.
CodeDomAssert.AssertPropertyExists(type.Members, "MyProperty", 
"System.String", true, true, "myProperty");
}

We'll refactor the CodeDom code later as we are sure to reuse it. Again, it would be really nice to create ChangeFieldPropertyNameCombo() as an extension method for the CodeTypeDeclaration class, as you only ever look for the combos within a single type, but as we don't have that option in .NET 2.0, we have to include the type object in the call as well.

It occurs to me while writing ChangeFieldPropertyNameCombo() that my strategy is wrong here. I'm calling this code assuming that the field relies on the property and thus the names do too, but the ChangeFieldName() and ChangePropertyName() methods don't reflect that. I actually need to derive the new field name from the new property name and change my code accordingly. The default assumption is that it will be the same as the property name but with the first letter in lower case. Hence the call to a new function, GetFieldNameBasedOnPropertyName(), which currently just calls FirstLetterToLowerCase().

public void ChangeFieldPropertyNameCombo(
CodeTypeDeclaration type, CodeMemberProperty property)
{
//Local variable to store the field name corresponding 
//to the property and the one to change it to.
string originalFieldName = "";
string newFieldName = "";
//Change the property's name 
property.Name = GetNewPropertyName(property.Name);
//Get the new field name based on the new property name
newFieldName = 
GetFieldNameBasedOnPropertyName(property.Name);
//Check GetStatement if there is one. 
//Assumes it looks like "return fieldname"
if (property.HasGet)
{
//A simple get statement just returns the field 
CodeMethodReturnStatement thisGetter = 
property.GetStatements[0] as CodeMethodReturnStatement;
//But you need to get the Reference to the field from the 
//return statement to see which field is being returned.
CodeFieldReferenceExpression fieldGotten = 
thisGetter.Expression as CodeFieldReferenceExpression;
//store original field name for reference
originalFieldName = fieldGotten.FieldName;
fieldGotten.FieldName = newFieldName;
}
//Check SetStatement if there is one. 
//Assumes it looks like "fieldname = value;". 
//Assumes getter and setter have same field name
if (property.HasSet)
{
//A set statement assigns values,
// so we need to find a CodeAssignStatement
CodeAssignStatement thisSetter = 
property.SetStatements[0] as CodeAssignStatement;
//Get left side of statement. 
//Should be a reference to the private field
CodeFieldReferenceExpression fieldBeingSet = 
thisSetter.Left as CodeFieldReferenceExpression;
//If property is write only, there will be no getter, 
//so we need to get originalFieldName before changing it
if (string.IsNullOrEmpty(originalFieldName))
{
originalFieldName = fieldBeingSet.FieldName;
}
fieldBeingSet.FieldName = newFieldName; 
}
//If there is no getter or setter, there is no originalFieldName 
//to search for, so don't look for a field
//If there is, do
if (!string.IsNullOrEmpty(originalFieldName))
{
foreach (CodeTypeMember member in type.Members)
{
if (member.GetType() == typeof(CodeMemberField))
{
CodeMemberField field = member as CodeMemberField;
if (field.Name == originalFieldName)
{
field.Name = newFieldName;
}
}
}
}
}

One of the corollaries to the TDD rule "Write only the code that passes the test" seems to be "Don't get ahead of yourself". Maybe Oren's idea should be rephrased as "Code Little While Thinking Little, Think some more, Repeat" (hey, I refactored his idea). Anyway, it would seem I have got ahead of myself and catered for the case where a property doesn't have both get and set statements by isolating the pertinent blocks of code in if statements. I did add in these tests next though for completeness. I may have to write other tests which check what happens if the gets and sets aren't in the standard simple 'return field' and 'field = value' form, but that's for later. At the moment, I'm guessing that our conversion method only returns gets and sets of that form.
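
If the "simple form" guess ever turns out to be wrong, one option would be to probe the getter before touching it. The following is only a sketch of that idea, with a hypothetical helper name, and it is not part of the tool as it stands: it returns the field reference when the getter is exactly one 'return field' statement and null otherwise.

```csharp
using System.CodeDom;

// Hypothetical helper, not part of the tool described in this post.
public static class SimpleAccessorProbe
{
    // Returns the field reference when the getter consists of a single
    // 'return <field>;' statement, or null for any other shape.
    public static CodeFieldReferenceExpression GetReturnedField(
        CodeMemberProperty property)
    {
        if (!property.HasGet || property.GetStatements.Count != 1)
        {
            return null;
        }
        CodeMethodReturnStatement returnStatement =
            property.GetStatements[0] as CodeMethodReturnStatement;
        if (returnStatement == null)
        {
            return null;
        }
        // 'as' yields null if the returned expression is not a field reference
        return returnStatement.Expression as CodeFieldReferenceExpression;
    }
}
```

A null result would be the cue to skip the property or throw, rather than hitting a NullReferenceException halfway through a rename.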

I did a little refactoring at this point to isolate the CodeDom manipulation tests from the schema conversion ones too and pull out commonly used strings. They are now in a different class. Maybe I should reflect this in my program code too. Right or wrong, I've also deleted ChangeFieldName() and the tests that test it in preference to GetFieldNameBasedOnPropertyName().

Now let's see if our code holds for a type containing two property field combos.

[Test]
public void ChangeAllFieldPropertyNameCombosInType_TwoCombos_NamesGetsAndSetsChangedCorrectly()
{
//Create standard type with one property \ field combination
CodeMemberProperty property1;
CodeTypeDeclaration type1 = 
CreateTypeWithSimplePropertyFieldCombo(
type1Name, property1Name, stringType, out property1);   
//Create and add second combo to type
string field2Name = 
HelperFunctions.DefaultXsdGeneratedFieldName(property2Name);
CodeMemberField field2 = CreateField(field2Name, stringType);
CodeMemberProperty property2 = 
CreateSimpleGetSetProperty(property2Name, field2Name, stringType);
type1.Members.Add(property2);
type1.Members.Add(field2);
//Now convert the property field combinations
sc.ChangeAllFieldPropertyNameCombosInType(type1);
//Test the fields have been renamed correctly
CodeDomAssert.AssertFieldExists(type1.Members, 
MemberAttributes.Private, stringType, expectedField1Name);
CodeDomAssert.AssertFieldExists(type1.Members, 
MemberAttributes.Private, stringType, expectedField2Name);
//Test the properties have been renamed correctly and the fields 
//have been renamed correctly as well in the getters and setters.
CodeDomAssert.AssertPropertyExists(type1.Members, 
property1Name, stringType, true, true, expectedField1Name);
CodeDomAssert.AssertPropertyExists(type1.Members, 
property2Name, stringType, true, true, expectedField2Name);
}
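
The helper methods used in this test came out of the refactoring and aren't listed in the post. Going by the hand-built CodeDom in the first combo test, they probably look something like the sketch below. The class name is hypothetical, the signatures are inferred from the calls above, and the "Field" suffix convention is an assumption based on what xsd.exe generates.

```csharp
using System.CodeDom;

// Hypothetical test helpers; signatures inferred from the test calls.
public static class CodeDomTestHelpers
{
    // Assumption: xsd.exe names the backing field after the property,
    // first letter lower-cased, with a "Field" suffix.
    public static string DefaultXsdGeneratedFieldName(string propertyName)
    {
        return char.ToLowerInvariant(propertyName[0])
            + propertyName.Substring(1) + "Field";
    }

    public static CodeMemberField CreateField(string fieldName, string typeName)
    {
        return new CodeMemberField(typeName, fieldName);
    }

    public static CodeMemberProperty CreateSimpleGetSetProperty(
        string propertyName, string fieldName, string typeName)
    {
        CodeMemberProperty property = new CodeMemberProperty();
        property.Name = propertyName;
        property.Type = new CodeTypeReference(typeName);
        property.HasGet = true;
        property.HasSet = true;
        // Getter: 'return this.field;'
        property.GetStatements.Add(new CodeMethodReturnStatement(
            new CodeFieldReferenceExpression(
                new CodeThisReferenceExpression(), fieldName)));
        // Setter: 'this.field = value;'
        property.SetStatements.Add(new CodeAssignStatement(
            new CodeFieldReferenceExpression(
                new CodeThisReferenceExpression(), fieldName),
            new CodePropertySetValueReferenceExpression()));
        return property;
    }

    public static CodeTypeDeclaration CreateTypeWithSimplePropertyFieldCombo(
        string typeName, string propertyName, string memberTypeName,
        out CodeMemberProperty property)
    {
        string fieldName = DefaultXsdGeneratedFieldName(propertyName);
        property = CreateSimpleGetSetProperty(
            propertyName, fieldName, memberTypeName);
        CodeTypeDeclaration type = new CodeTypeDeclaration(typeName);
        type.IsClass = true;
        type.IsPartial = true;
        type.Members.Add(property);
        type.Members.Add(CreateField(fieldName, memberTypeName));
        return type;
    }
}
```

Unlike the hand-built version, this sketch gives the getter and setter separate field reference objects. That shouldn't matter, since the renaming code updates each one in turn.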

ChangeAllFieldPropertyNameCombosInType() just looks for all the properties in the type's Members collection and calls ChangeFieldPropertyNameCombo() for each one.

public void ChangeAllFieldPropertyNameCombosInType(
CodeTypeDeclaration type)
{
foreach (CodeTypeMember member in type.Members)
{ 
//Is this member a property?
if (member.GetType() == typeof(CodeMemberProperty))
{
CodeMemberProperty property = 
member as CodeMemberProperty;
ChangeFieldPropertyNameCombo(type, property);
}
}
}

And the test passes first time. Fantastic. Let's take one more step up and create two types in a namespace, each of which has a property field combination and see if our code continues to hold up. Our new test looks quite similar to the previous one. Finally, they are getting easier to write. Maybe the "hump" people talk about when writing code using TDD has been crested?

[Test]
public void ChangeAllFieldPropertyNameCombosInNamespace_TwoTypes_NamesGetsAndSetsChangedCorrectly()
{
//Create standard type with one property \ field combination
CodeMemberProperty property1;
CodeTypeDeclaration type1 = 
CreateTypeWithSimplePropertyFieldCombo(type1Name, 
property1Name, stringType, out property1);
//Create second type with one property \ field combination
CodeMemberProperty property2;
CodeTypeDeclaration type2 = 
CreateTypeWithSimplePropertyFieldCombo(type2Name, 
property2Name, stringType, out property2);
//Create namespace and add types to collection
CodeNamespace ns = new CodeNamespace(namespaceName);
ns.Types.Add(type1);
ns.Types.Add(type2);
sc.ChangeAllFieldPropertyNameCombosInNamespace(ns);
//Test the field has been renamed correctly
CodeDomAssert.AssertFieldExists(type1.Members, 
MemberAttributes.Private, stringType, expectedField1Name);
CodeDomAssert.AssertFieldExists(type2.Members, 
MemberAttributes.Private, stringType, expectedField2Name);
//Test the property has been renamed correctly and the 
//field has renamed correctly as well in the getter and setter.
CodeDomAssert.AssertPropertyExists(type1.Members, 
property1Name, stringType, true, true, expectedField1Name);
CodeDomAssert.AssertPropertyExists(type2.Members, 
property2Name, stringType, true, true, expectedField2Name);
}

The code is pretty much as you'd expect - we need to iterate through the collection of types in the namespace and call ChangeAllFieldPropertyNameCombosInType().

public void ChangeAllFieldPropertyNameCombosInNamespace
(CodeNamespace ns)
{
foreach (CodeTypeDeclaration type in ns.Types)
{ 
ChangeAllFieldPropertyNameCombosInType(type);
}
}

Our test passes no problem.

As with the cases where a property has no setter or no getter, we have to look at a couple of other 'boundary conditions' at this level. A CodeTypeDeclaration can represent a class, struct, enum or interface. We don't know whether xsd.exe produces any of the other three in addition to classes, but it would make no sense to look for property\field combinations in enums or interfaces, as they don't exist there. So we need to check that ChangeAllFieldPropertyNameCombosInType() throws an exception if it ever receives an enum or interface, and tweak ChangeAllFieldPropertyNameCombosInNamespace() so it doesn't pass them on by mistake. Here's the test to check for the enum error. The interface test is basically identical, and the changes to ChangeAllFieldPropertyNameCombosInType() can be found a little later in this post.

[Test]
[ExpectedException(typeof(System.InvalidOperationException), 
@"There are no Property\Field pairs to convert in an Enum")]
public void ChangeAllFieldPropertyNameCombosInType_TypeIsEnum_ThrowsInvalidOperationException()
{
CodeTypeDeclaration type = 
new CodeTypeDeclaration(type1Name);
type.IsEnum = true;
sc.ChangeAllFieldPropertyNameCombosInType(type);
}
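
The tweak to the namespace-level method isn't listed in this post. One way to do it, sketched here with a hypothetical helper rather than the actual change, is to filter out enums and interfaces before handing types on.

```csharp
using System.CodeDom;
using System.Collections.Generic;

// Hypothetical helper; the real tweak may simply be an 'if' inside
// ChangeAllFieldPropertyNameCombosInNamespace().
public static class NamespaceTypeFilter
{
    // Returns only the types that can hold property\field combos,
    // i.e. everything that is not an enum or an interface.
    public static List<CodeTypeDeclaration> GetClassesAndStructs(
        CodeNamespace ns)
    {
        List<CodeTypeDeclaration> result = new List<CodeTypeDeclaration>();
        foreach (CodeTypeDeclaration type in ns.Types)
        {
            if (type.IsEnum || type.IsInterface)
            {
                continue;
            }
            result.Add(type);
        }
        return result;
    }
}
```

The namespace method would then loop over the filtered list, so the exceptions in ChangeAllFieldPropertyNameCombosInType() only fire if someone calls it directly with the wrong kind of type.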

One more question. What if a schema looked like this.

<?xml version="1.0" encoding="Windows-1252" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:element name="stringElement" type="xs:string"></xs:element>
<xs:element name="StringElement" type="xs:dateTime"></xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

Now it is daft and poor naming, but it happens and xsd deals with it like so

  • The string stringElement gets turned into the combo stringElementField\stringElement
  • The dateTime StringElement gets turned into the combo stringElementField1\StringElement

So then, we have a potential problem. Our current routine could give fields and properties the same name. In the above example, both combos would get renamed stringElement\StringElement. This is subtle enough that we probably wouldn't have spotted it until trying to compile the code that it had generated. It's a bug I encountered in the .NET 1.0 version of this app though, so I knew about it in advance.

Incidentally, this problem also validates the decision to choose a field name based on the property name and not what we would expect xsd to call it. stringElementField1 would have been rejected as an invalid Xsd-generated field name.

If xsd suffixes names with numbers, maybe we should suffix potential clashing names with letters instead. As long as we can name the properties inside a type uniquely, the field names will follow suit. Taking the same example, let's have the results look like this.

  • The string stringElement gets turned into the combo stringElement\StringElement
  • The dateTime StringElement gets turned into the combo stringElementA\StringElementA

It's sufficient but not pretty. We can blame the schema writer for that. :)

Let's write a test. It will be exactly the same as ChangeAllFieldPropertyNameCombosInType_TwoCombos_NamesGetsAndSetsChangedCorrectly() but with different values plugged into the second field and property name and the expected results. Run it and it fails because it finds two fields called myProperty. So somewhere in our code we need to keep a running track of the new property names being used in a type as we go along. Then GetNewPropertyName() can check in the list for the proposed new property name and if it exists can add the suffix.

So how to implement the list? An ArrayList seems like a good idea for now. That means GetNewPropertyName() should now look like this, given NewPropertyNames is an ArrayList.

public string GetNewPropertyName(string original)
{
string newName = FirstLetterToUpperCase(original); 
// Is newName one of the property names already processed?
// If so, suffix newName with an A
if (NewPropertyNames.IndexOf(newName) > -1)
{
newName += 'A';
} 
// Add newName to the current list of property names.
NewPropertyNames.Add(newName); 
return newName;
}

Or does it? NewPropertyNames could be set as a public property to the class but it only has relevance within a type so it should be created within ChangeAllFieldPropertyNameCombosInType() and passed \ made available down the chain to the point where a new name can be added. A public property is easier to implement but you'd need to be careful to empty the ArrayList after working through each type to prevent false positive results for other types down the road. Creating the ArrayList in ChangeAllFieldPropertyNameCombosInType() means that we always start from scratch with each new type.

TDD principles state that the minimum should be done to pass the test and leave the others all green, so let's code it using this second method and we can change things around later if required. Here's the first attempt. The ArrayList is created in ChangeAllFieldPropertyNameCombosInType() and passed onwards. (You can also see here the checks we added to throw exceptions to fulfil the tests mentioned earlier.)

public void ChangeAllFieldPropertyNameCombosInType(
CodeTypeDeclaration type)
{
if (type.IsEnum)
{
throw new InvalidOperationException(
@"There are no Property\Field pairs to convert in an Enum");
}
else if (type.IsInterface)
{
throw new InvalidOperationException(
@"There are no Property\Field pairs to convert in an Interface");
}
else // type must therefore be either Class or Struct
{
//Create arraylist to store propertynames 
//generated within this type and so prevent renaming clashes
ArrayList newPropertyNames = new ArrayList(); 
foreach (CodeTypeMember member in type.Members)
{
//Is this member a property?
if (member.GetType() == typeof(CodeMemberProperty))
{
CodeMemberProperty property = 
member as CodeMemberProperty;
ChangeFieldPropertyNameCombo(
type, property, newPropertyNames);
}
}
}
}

All this new overload of ChangeFieldPropertyNameCombo() does differently from the one already established is generate the new property name by checking for clashes with those already in the list. So a bit of refactoring means all we need are three overloads for ChangeFieldPropertyNameCombo() as follows

  • private void ChangeFieldPropertyNameCombo(type, property, propertyName) which does all the work
  • public void ChangeFieldPropertyNameCombo(type, property, newPropertyNames) which generates the new property name keeping in mind the list of already created names
  • public void ChangeFieldPropertyNameCombo(type, property) which generates the new property name without keeping in mind the list of already created names

Finally, we can revert GetNewPropertyName(original) to its original form and introduce the following overload that takes the ArrayList of names as well.

public string GetNewPropertyName(
string original, ArrayList newPropertyNames)
{
//Get the basic new name from the original overload
string newName = GetNewPropertyName(original); 
// Is newName one of the property names already processed?
// If so, keep suffixing newName with an A until it is unique
while (newPropertyNames.IndexOf(newName) > -1)
{
newName += 'A';
} 
// Add newName to the current list of property names.
newPropertyNames.Add(newName); 
return newName;
}
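
Note the switch from if to while in this version. A quick standalone sketch shows why it matters when more than two names collide. The class and the inlined capitalisation helper here are stand-ins for illustration, not the project's own code.

```csharp
using System;
using System.Collections;

// Standalone illustration of the suffixing scheme; not the project's class.
public static class UniqueNameDemo
{
    private static string FirstLetterToUpperCase(string value)
    {
        return char.ToUpperInvariant(value[0]) + value.Substring(1);
    }

    // Keeps appending 'A' until the name is not already in usedNames,
    // then records the name so later calls can see it.
    public static string GetNewPropertyName(string original, ArrayList usedNames)
    {
        string newName = FirstLetterToUpperCase(original);
        while (usedNames.Contains(newName))
        {
            newName += "A";
        }
        usedNames.Add(newName);
        return newName;
    }

    public static void Main()
    {
        ArrayList usedNames = new ArrayList();
        Console.WriteLine(GetNewPropertyName("stringElement", usedNames));
        Console.WriteLine(GetNewPropertyName("StringElement", usedNames));
        Console.WriteLine(GetNewPropertyName("StringElementA", usedNames));
    }
}
```

The three calls give StringElement, StringElementA and StringElementAA. With a plain if, a name that still clashed after one suffix would have been added to the list as a duplicate.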

Code compiles and all tests are green. "Proper job", as some folks say. XSD generates code in one namespace only, so we won't look at looping across multiple namespaces in a CodeCompileUnit, which would be the next level up, but it's a possibility. NCover reports 98% test coverage of our current program code. There are 1020 lines of test code against 330 lines of live code, so the ratio of test:live has improved too, from 7:1 to 3:1. The next logical thing to do would be to plug our current pieces of code together and get a working tool. I foresee questions like 'how do I test a private method' and 'how do I test the end result - a class file' ahead.
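
As a taster for that last question, and for the "next level up" mentioned above, here is a rough sketch of walking every namespace in a CodeCompileUnit and turning it into C# source text with CSharpCodeProvider. The class and method names are made up for this example, and on newer .NET versions System.CodeDom may need to be added as a separate package.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

// Hypothetical helper for eyeballing (or string-testing) generated code.
public static class CodeDomOutput
{
    // Generates C# source for every namespace in the compile unit.
    public static string GenerateSource(CodeCompileUnit unit)
    {
        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        using (StringWriter writer = new StringWriter())
        {
            CodeGeneratorOptions options = new CodeGeneratorOptions();
            options.BracingStyle = "C"; // braces on their own lines
            foreach (CodeNamespace ns in unit.Namespaces)
            {
                provider.GenerateCodeFromNamespace(ns, writer, options);
            }
            return writer.ToString();
        }
    }
}
```

Running the renaming code over each namespace before calling something like this would give source text that a test could inspect or try to compile.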