Dimensioning Strings

For one of my customers, I have to develop applications that communicate with each other via fixed-width files. In other words, they have developed a file format that is "generic" to a lot of operations. If I write a new utility, it is expected that the utility will be able to read, write, and parse this generic file. The file is fixed-width, with each field taking up a "hard-coded" number of bytes.

Of course I'll write a reusable class to handle the file I/O. My question is, what is the best approach to handling fixed-width fields?

Each field will have to be a property. All data is essentially string data. I could define a string per field, but how do I force the strings to be a specific length? All strings will be right-padded with spaces...

For example, there is a 20-byte "account number" field. It starts at byte 56 in the record, and always takes up 20 bytes in the record. If I pass in a 12-byte value, the string should hold 12 bytes followed by 8 spaces.

In C#, you cannot "DIM" a string to a pre-defined length, can you?

Should I use StringBuilder to dynamically build the record, appending the fields/string, then appending spaces equal to the required length of the field minus the actual length of the string?

Is there an elegant way to store the field names/lengths?

I'm envisioning a class called "GenericIO", with the fields as properties:

GenericIO.account = "123456789ABC";

The "setter" for that property would look up the required length of "account", perhaps stored in a 2-dimensional array. It would then pad out the string, adding the requisite number of spaces, and would return the final formatted string. If the string is too long, the class would generate an error.
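A minimal sketch of such a setter, assuming a hypothetical helper called FixedWidth.Fit and a width of 20 taken from the account number example. The helper name, the constant, and the exception choices are illustrative assumptions, not part of the customer's format.

```csharp
using System;

// Hypothetical helper: pads a value with trailing spaces to an exact width.
public static class FixedWidth
{
    public static string Fit(string value, int length)
    {
        if (value == null)
            throw new ArgumentNullException("value");
        if (value.Length > length)
            throw new ArgumentException("Value is longer than " + length + " characters.", "value");
        return value.PadRight(length); // built-in padding, no explicit loop
    }
}

public class GenericIO
{
    private const int AccountLength = 20; // assumed width, from the example above
    private string _account = new string(' ', AccountLength);

    public string Account
    {
        get { return _account; }
        set { _account = FixedWidth.Fit(value, AccountLength); }
    }
}
```

Note that Length counts characters, which matches the byte count only for single-byte encodings such as ASCII.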

Is this a good approach? Other suggestions?

tgreer
Made Her Cry
Team Colleague
2,118 posts since Dec 2004
Reputation Points: 227
Solved Threads: 37
 

I would do this:

private string accountNumber;
private int accountNumberLength = 20; // width of the field
public string AccountNumber
{
    get
    {
        return accountNumber;
    }
    set
    {
        // StringBuilder has no Add method; Append is the call that exists
        StringBuilder sb = new StringBuilder(value);
        while (sb.Length < accountNumberLength)
            sb.Append(' ');
        accountNumber = sb.ToString();
    }
}
campkev
Posting Pro in Training
484 posts since Jul 2005
Reputation Points: 14
Solved Threads: 19
 

That's precisely what I hoped to avoid: a loop-based method of padding out strings.

I think I'll use a Dictionary (.NET Framework 2.0) to hold field names-to-lengths. I can create space-filled strings for each field/property in the constructor. Then when a value is passed into a particular field, I'll need a quick, efficient way to "insert" the value, leaving any trailing spaces. I definitely do NOT want a loop.

tgreer
Made Her Cry
Team Colleague
2,118 posts since Dec 2004
Reputation Points: 227
Solved Threads: 37
 

On second thought, what about:

private string accountNumber, filler = "                    "; // make filler at least as long as the field is supposed to be
private int accountNumberLength = 20;
public string AccountNumber
{
    get
    {
        return accountNumber;
    }
    set
    {
        // appending the filler and cutting at the field length pads short values (and truncates long ones)
        accountNumber = (value + filler).Substring(0, accountNumberLength);
    }
}
campkev
Posting Pro in Training
484 posts since Jul 2005
Reputation Points: 14
Solved Threads: 19
 

I don't like that approach because it isn't self-documenting. Hard-coding the spaces would work, but another programmer could come along, not understand the reason for the spaces, and delete them. Or they could delete just one of the spaces by accident, introducing a hard-to-find error.

What I've done: make a private string for each field. Make a private dictionary with the fieldnames as keys, the lengths as values. The class constructor will use a foreach to get each kvp (key-value-pair), and then "fill" the string using the .PadRight method.

Next, create the public strings and code the get/set, with the "set" procedure doing error-checking and proper padding of shorter values.
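One way this could look, as a rough sketch only. It assumes a variation on the plan above: the values are kept in a second dictionary keyed by field name, so the constructor's foreach can fill them without having to find a variable by its name. The class name FieldStore and the field names and widths are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variation: values live in a dictionary keyed by field name.
public class FieldStore
{
    private readonly Dictionary<string, int> _lengths = new Dictionary<string, int>();
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();

    public FieldStore()
    {
        // Assumed field names and widths, for illustration only.
        _lengths.Add("Field1", 20);
        _lengths.Add("Field2", 5);
        _lengths.Add("Field3", 10);

        foreach (KeyValuePair<string, int> kvp in _lengths)
            _values[kvp.Key] = "".PadRight(kvp.Value); // space-filled default
    }

    public string Field2
    {
        get { return _values["Field2"]; }
        set { _values["Field2"] = Pad("Field2", value); }
    }

    // Error-checks the value, then pads it to the width registered for the field.
    private string Pad(string name, string value)
    {
        int length = _lengths[name];
        if (value.Length > length)
            throw new ArgumentException(name + " accepts at most " + length + " characters.");
        return value.PadRight(length);
    }
}
```

Only Field2 is shown; the other properties would follow the same pattern.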

tgreer
Made Her Cry
Team Colleague
2,118 posts since Dec 2004
Reputation Points: 227
Solved Threads: 37
 

I don't have 2.0, so I'm not familiar with Dictionary, but essentially you are doing this:

private string accountNumber;
private int accountNumberLength;
public string AccountNumber
{
    get
    {
        return accountNumber;
    }
    set
    {
        // PadRight appends the spaces after the value; PadLeft would put them in front
        accountNumber = value.PadRight(accountNumberLength, ' ');
    }
}


But instead of accountNumber and accountNumberLength, you are using an array (Dictionary) for all the different fields. Do I have that right? Just trying to learn.

campkev
Posting Pro in Training
484 posts since Jul 2005
Reputation Points: 14
Solved Threads: 19
 

Not quite. Something like this:

using System;
using System.Collections.Generic;
using System.Text;
using System.Reflection;

namespace tgreer
{
	public class GenericIO
	{
		 private string _field1;  // 20 bytes
		 private string _field2;  // 5 bytes
		 private string _field3;  // 10 bytes

		 private Dictionary<string, int> _fieldLengths = new Dictionary<string, int>(3);

		public GenericIO()
		{
			// class constructor
			_fieldLengths.Add("_field1", 20);
			_fieldLengths.Add("_field2", 5);
			_fieldLengths.Add("_field3", 10);

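			// Type.GetType("GenericIO") returns null here because the name is not namespace-qualified
			Type _myTypeObject = this.GetType();
			// note: GetMembers() with no arguments returns public members only
			MemberInfo[] _myMemberArray = _myTypeObject.GetMembers();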

			foreach (KeyValuePair<string, int> kvp in _fieldLengths)
			{
				// this loop "initializes" each field with the proper number of spaces
			  
			}


		}
	}
}


I'm not done, of course, coding the foreach loop. The constructor uses Reflection on itself, so that it can use the strings in the Dictionary (basically, an optimized 2-dimensional array) to actually get the corresponding string and set its value.

I need a way to pass a string "_field1" to get the string variable _field1, and that's accomplished with reflection. Then I'll use PadRight to fill the string with the number of spaces it requires.
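For what it's worth, here is a hedged sketch of how that name-to-variable lookup might be done with reflection. The class name ReflectedFields and the read-only properties are assumptions added so the result can be inspected; the key point is that private instance fields are only found when the binding flags ask for them.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class ReflectedFields
{
    private string _field1;
    private string _field2;

    private readonly Dictionary<string, int> _fieldLengths = new Dictionary<string, int>();

    public ReflectedFields()
    {
        _fieldLengths.Add("_field1", 20);
        _fieldLengths.Add("_field2", 5);

        Type type = typeof(ReflectedFields);
        foreach (KeyValuePair<string, int> kvp in _fieldLengths)
        {
            // Private instance fields are only returned with these binding flags.
            FieldInfo field = type.GetField(kvp.Key, BindingFlags.NonPublic | BindingFlags.Instance);
            if (field == null)
                throw new InvalidOperationException("No field named " + kvp.Key);
            field.SetValue(this, "".PadRight(kvp.Value)); // space-filled default
        }
    }

    // Read-only accessors, added here only to make the result visible.
    public string Field1 { get { return _field1; } }
    public string Field2 { get { return _field2; } }
}
```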

Yeah, I'm probably overcomplicating this... thanks for the discussion, though.

tgreer
Made Her Cry
Team Colleague
2,118 posts since Dec 2004
Reputation Points: 227
Solved Threads: 37
 

Hi, for padding you can use this:

string a = "123";
string b = String.Format("{0,-20}", a);

b has length 20: it contains "123" left-aligned, and the rest is filled with spaces on the right.
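To make the sign of the width explicit, a small sketch follows. The class and method names are made up for illustration; only String.Format and its alignment component are standard.

```csharp
using System;

public static class AlignmentDemo
{
    // Negative width: value is left-aligned and padded on the right.
    public static string LeftAligned(string value)
    {
        return String.Format("{0,-20}", value);
    }

    // Positive width: value is right-aligned and padded on the left.
    public static string RightAligned(string value)
    {
        return String.Format("{0,20}", value);
    }
}
```

The width is a minimum, not a maximum: a value longer than 20 characters comes back at its full length, so a fixed-width field still needs its own check for values that are too long.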

Hope this helps.

_r0ckbaer
Light Poster
45 posts since Dec 2005
Reputation Points: 13
Solved Threads: 7
 

Yes, it did. I can use that as I code the "get" procedures for each of the public properties, thanks.

Here's my solution so far. I'd like comments on this approach. Am I dramatically overcomplicating the issue? The goal again is for the class constructor to initialize certain public properties, filling them with a certain number of spaces.

The names of the properties, as well as the required "size" and/or length, are contained in a Dictionary.

If there is a more efficient and clear (self-documenting) way of doing this, I'd like to hear about it.

using System;
using System.Collections.Generic;
using System.Text;
using System.Reflection;

namespace TGREER
{
    public class GenericIO
    {
        // first all private/public fields
        
        private string _field1 = "";
        public string Field1
        {
            get { return _field1; }
            set { _field1 = value; }
        }
        private string _field2 = "";
        public string Field2
        {
            get { return _field2; }
            set { _field2 = value; }
        }
        private string _field3 = "";
        public string Field3
        {
            get { return _field3; }
            set { _field3 = value; }
        }

        StringBuilder workString = new StringBuilder();
        private Dictionary<string, int> _fieldLengths = new Dictionary<string, int>(3);

        public GenericIO()
        {
            // class constructor
            _fieldLengths.Add("Field1", 20);
            _fieldLengths.Add("Field2", 5);
            _fieldLengths.Add("Field3", 10);

            // use Reflection to use the strings in the Dictionary to retrieve the actual string object
            Type MyType = this.GetType();

            workString.Append(" ");

            foreach (KeyValuePair<string, int> kvp in _fieldLengths)
            {
                workString.Remove(0, workString.Length);
                workString.Insert(0, " ", kvp.Value);
                PropertyInfo Mypropertyinfo = MyType.GetProperty(kvp.Key.ToString());
                Mypropertyinfo.SetValue(this, workString.ToString(), null);
            }
        }
    }
}

I know it would be more efficient simply to define the private variables as space-filled strings:

private string _field1 = "                    ";


The problem is maintainability. What if a slip of the fingers causes one of those spaces to be deleted?

The next step is to rewrite the public "setter" procedures to maintain the trailing spaces and to generate an error if the value is too long.

tgreer
Made Her Cry
Team Colleague
2,118 posts since Dec 2004
Reputation Points: 227
Solved Threads: 37
 

The public properties look like this:

private string _field1;
public string Field1
{
    get { return _field1; }
    set
    {
        // a negative alignment width left-aligns the value and pads it on the right
        int size = -1 * _fieldLengths["Field1"];
        string format = "{0," + size.ToString() + "}";
        _field1 = String.Format(format, value);
    }
}


Works great.

tgreer
Made Her Cry
Team Colleague
2,118 posts since Dec 2004
Reputation Points: 227
Solved Threads: 37
 

Hi, I think you are definitely overcomplicating things here.
Why don't you write something like this in the constructor:

_field1 = new string(' ', 20);
_r0ckbaer
Light Poster
45 posts since Dec 2005
Reputation Points: 13
Solved Threads: 7
 

Doh!

For some reason I had convinced myself that I'd need to use Reflection... because I'd have to set only those properties that represent fields to space-filled strings, but not any other public properties. Somehow "Reflection" popped into my head, and prevented me from seeing the obvious: only set the properties you actually need to set...

I'm still using the Dictionary, so that the field lengths are only coded in one spot. The constructor and the public "setter" proc can both reference the Dictionary.

Thanks for your help.

tgreer
Made Her Cry
Team Colleague
2,118 posts since Dec 2004
Reputation Points: 227
Solved Threads: 37
 

Just so this thread doesn't make me look like a complete dope...

The class will be used in another application. That application parses a configuration file. That config file contains the name of the property to set, followed by an expression telling it how to find the value.

So, I had to use reflection anyway to turn the name of the field, from the config file, into the actual property to set. I just didn't need to use reflection in the class itself.
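Roughly, that lookup in the calling application could be sketched like this. AccountRecord and ConfigBinder are hypothetical names, and the config parsing itself is left out; the sketch only shows turning a property name into a property and setting it.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the fixed-width record class.
public class AccountRecord
{
    private string _account = new string(' ', 20);

    public string Account
    {
        get { return _account; }
        set { _account = value.PadRight(20); }
    }
}

public static class ConfigBinder
{
    // Sets the public property called propertyName on target to value.
    public static void SetByName(object target, string propertyName, string value)
    {
        PropertyInfo property = target.GetType().GetProperty(propertyName);
        if (property == null)
            throw new ArgumentException("Unknown property: " + propertyName);
        property.SetValue(target, value, null); // runs the property's own setter
    }
}
```

Because SetValue goes through the setter, the padding logic stays inside the record class and the calling application never needs to know the field widths.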

tgreer
Made Her Cry
Team Colleague
2,118 posts since Dec 2004
Reputation Points: 227
Solved Threads: 37
 

I like that.
If that is really you, you're interesting too!

aacinc
Newbie Poster
2 posts since Oct 2009
Reputation Points: 10
Solved Threads: 0
 

You could also look at the string.PadRight method. It returns a new string of the given length, with spaces inserted to fill where required.
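A short sketch of the behaviour to expect, with a made-up wrapper name. One detail worth knowing for fixed-width fields is that PadRight only ever adds characters.

```csharp
using System;

public static class PadRightDemo
{
    public static string Pad(string value, int width)
    {
        // PadRight never truncates: a value already at or beyond width comes back unchanged.
        return value.PadRight(width);
    }
}
```

So a 12-character account number padded to 20 gains 8 trailing spaces, while a 25-character value stays at 25 and has to be rejected or trimmed separately.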

Ryshad
Nearly a Posting Virtuoso
1,307 posts since Aug 2009
Reputation Points: 512
Solved Threads: 246
 
