I am watching lectures on C++ from the Stanford online courses website. I have a slight problem understanding how memory is allocated for a string in C++. Basically, when you declare a vector, the constructor is called: memory is allocated for the array arr, and numUsed and numAllocated are initialized.

The first question I have is how much memory is allocated initially? Is it 2 bytes or is it 3 bytes with memory for the null character?

Secondly, in the member function add, the statement arr[numUsed++] = s assigns the first string to arr[0] and the next string to arr[1]. Now how are the strings stored in memory? How can I access individual characters of arr[0] or arr[1] (is it arr[0][0], arr[0][1]?)?

Thanks. I hope I am clear with my questions. I have pasted the code from the lecture below.

/*
 *  File: myvector.h
 *  ----------------
 *
 *  Created by Julie Zelenski on 2/22/08.
 *
 */
#ifndef _myvector_h
#define _myvector_h

#include "genlib.h"

class MyVector
{
 public:
	MyVector();
	~MyVector();
	
	int size();
	void add(string s);
	string getAt(int index);
	
  private:
	string *arr;
	int numUsed, numAllocated;
	void doubleCapacity();

};


#endif

/*
 *  File: myvector.cpp
 *  ------------------
 *
 *  Created by Julie Zelenski on 2/22/08.
 *
 */

#include "myvector.h"

MyVector::MyVector()
{	
	arr = new string[2];
	numAllocated = 2;
	numUsed = 0;
}

MyVector::~MyVector()
{
	delete[] arr;
}


int MyVector::size()
{
	return numUsed;
}

string MyVector::getAt(int index)
{
	if (index < 0 || index >= size())
		Error("Out of bounds");
	return arr[index];
}

void MyVector::add(string s)
{
	if (numUsed == numAllocated)
		doubleCapacity();
	arr[numUsed++] = s;
}

void MyVector::doubleCapacity()
{
	string *bigger = new string[numAllocated*2];
	for (int i = 0; i < numUsed; i++)
		bigger[i] = arr[i];
	delete[] arr;
	arr = bigger;
	numAllocated*= 2;
}

1. Initially the MyVector object itself occupies its own memory (roughly sizeof(string*) + 2*sizeof(int), possibly plus padding, for arr (pointer to string) and numAllocated and numUsed (two integers)). Then the constructor allocates memory for two string objects (size = 2*sizeof(string)). So it is neither 2 bytes nor 3 bytes: new string[2] creates two empty string objects, not two characters. I cannot tell you how big each of these is because that depends on the compiler, the standard library and the platform. If you really want to know, you can check it for your environment by writing a simple program that prints the above two sizes. But you should not rely on the exact size, because the standard does not guarantee it in any way.
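Here is a minimal sketch of such a size check. MyVectorLayout, objectBytes and initialArrayBytes are names I made up for illustration, and I use std::string directly instead of the lecture's genlib.h, so treat it as an assumption about the setup rather than the lecture's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in with the same data members as MyVector.
struct MyVectorLayout {
    std::string *arr;
    int numUsed, numAllocated;
};

// Size of the vector object itself: one pointer and two ints,
// plus whatever padding the compiler adds.
std::size_t objectBytes() {
    return sizeof(MyVectorLayout);
}

// Bytes taken by `count` string objects in the array, as in `new string[count]`.
// This does not include the characters the strings may allocate later,
// nor any bookkeeping the allocator keeps for itself.
std::size_t initialArrayBytes(int count) {
    assert(count >= 0);
    return static_cast<std::size_t>(count) * sizeof(std::string);
}
```

Printing objectBytes() and initialArrayBytes(2) from main with std::cout gives the two numbers for your compiler. They will differ between platforms and standard libraries.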

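2. For the null-character stuff, this is internal to the class "string", and in 12 years of programming I have frankly never cared about it. This is what is called "abstraction" in C++. All you need to know is that size() does not count any terminator, and that c_str() hands you a null-terminated character sequence if you need one.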

3. The string objects are stored in memory one after the other, but each "string" object typically holds a pointer to the actual character sequence, which lives somewhere else on the heap (some implementations keep very short strings inside the object itself). How the characters are laid out is not something you should depend on, because it is meant to be abstracted away from whoever uses class "string". So to answer the question: yes, to access a character of a string you index twice, e.g. arr[1][2] for the third character of the second string in the array (indices start at 0).
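A small sketch of the double indexing, assuming a plain array of std::string like arr in the lecture code. charAt is a helper name I made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns character number `pos` of string number `index` in an array of strings.
// The first [] selects the string object, the second [] selects a character in it.
char charAt(const std::string *arr, std::size_t index, std::size_t pos) {
    assert(pos < arr[index].size());  // operator[] itself does no bounds checking
    return arr[index][pos];
}
```

From outside the class, arr is private, so with the lecture's MyVector you would write something like v.getAt(1)[2] instead.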

The above implementation is called a dynamic array because it doubles its capacity every time more space is needed. This greatly reduces the number of new memory allocations and therefore improves performance over time. However, for a vector of strings the benefit is smaller, because each string may allocate memory of its own whenever it is created, copied or grown, so the doubling only saves on reallocating the array itself.
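To see how few reallocations the doubling needs, here is a sketch that only simulates the counters of add and doubleCapacity, without allocating anything. countReallocations is a made-up name, and the starting capacity of 2 is taken from the lecture's constructor.

```cpp
#include <cassert>

// Simulates n calls to add() on a vector that starts with capacity 2 and
// doubles when full. Returns how many times doubleCapacity() would run.
int countReallocations(int n) {
    assert(n >= 0);
    int capacity = 2, used = 0, reallocations = 0;
    for (int i = 0; i < n; i++) {
        if (used == capacity) {  // same test as in MyVector::add
            capacity *= 2;
            reallocations++;
        }
        used++;
    }
    return reallocations;
}
```

With this scheme 1000 calls to add trigger only 9 doublings (capacity goes 2, 4, 8, ..., 1024), whereas growing by one slot at a time would reallocate on almost every call.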

Hey, thanks a lot. I just had another question. You said that the constructor allocates memory for 2 string objects and that the size it allocates is OS and PC specific. Does that mean that, like an integer which can be 4 bytes on some platforms, a string object has a "fixed" size associated with it? Also, as a follow-up, is there a limit to the number of characters that make up a string? I doubt the string will keep resizing itself if we write past the memory allocated. Just asking out of curiosity.

Thanks a lot again.

Yeah, that's essentially it for the size of a string object. The string class must hold a pointer to the characters, and that pointer is like an int in terms of the rules about its size: it can be 4 or 8 bytes (or even 2 on micro-controllers). The object can hold other stuff too, for example an integer for the length of the string, so that it doesn't need to traverse the characters every time you ask for the length. Whatever it holds, sizeof(string) is fixed for a given compiler and library, no matter how many characters the string contains.
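As a rough illustration, this is what such a string object could look like. SimpleString is purely hypothetical and is not how any particular standard library lays out std::string. The checker function just confirms that the object size does not depend on the number of characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical layout only: a pointer to the characters plus bookkeeping.
// Real implementations differ in their details.
struct SimpleString {
    char *data;            // characters live elsewhere, usually on the heap
    std::size_t length;    // number of characters in use
    std::size_t capacity;  // number of characters allocated
};

// Checks that a short and a long std::string are objects of the same size.
void checkFixedObjectSize() {
    std::string shortStr = "hi";
    std::string longStr(1000, 'x');
    assert(sizeof(shortStr) == sizeof(longStr));
    assert(longStr.size() == 1000);
}
```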

No, there is no limit to the length of a string for all practical purposes. Formally the upper bound is whatever max_size() returns, and of course you are also limited by the amount of memory available on the computer, but that goes without saying. I have, for example, in my earlier days read some huge files (megabytes long or more) into a single string without hitting any limit. If you think you might hit it, you can wrap the operation in a "try-catch" block and catch std::bad_alloc (out of memory) or std::length_error (longer than max_size()), but I wouldn't bother with that.
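If you do want to guard against the limits, a sketch could look like this. GrowResult and tryReserve are names I made up. The standard exceptions involved are std::length_error (request larger than max_size()) and std::bad_alloc (allocation failed).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <string>

enum class GrowResult { Ok, TooLong, OutOfMemory };

// Tries to make room for n characters in s and reports what happened
// instead of letting the exception escape.
GrowResult tryReserve(std::string &s, std::size_t n) {
    try {
        s.reserve(n);
        assert(s.capacity() >= n);
        return GrowResult::Ok;
    } catch (const std::length_error &) {
        return GrowResult::TooLong;      // n is larger than s.max_size()
    } catch (const std::bad_alloc &) {
        return GrowResult::OutOfMemory;  // the allocator could not provide the memory
    }
}
```

On some systems a very large request may appear to succeed and only fail later when the memory is actually touched, so this is not a complete safety net.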

Yes, the string will resize itself as needed when you grow it through its own interface (assignment, +=, append, push_back and so on). Note that this does not cover writing past the end with operator[]: indexing beyond the current size does not grow the string and is simply a bug. How the growing is done is not specified in the standard, so a particular implementation of class "string" might use a more efficient scheme like allocating more memory than it needs in anticipation of having to grow later (like a "dynamic array"). This is left unspecified on purpose, so that the implementer can do it in any way they wish.
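You can observe the growth scheme of your own library with a sketch like the following. countCapacityChanges is a made-up helper. The numbers it reports are implementation specific, so I am not quoting any.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends n characters one at a time and counts how often capacity() changes,
// which is a reasonable proxy for how often the string reallocated.
std::size_t countCapacityChanges(std::size_t n) {
    std::string s;
    std::size_t changes = 0;
    std::size_t last = s.capacity();
    for (std::size_t i = 0; i < n; i++) {
        s.push_back('x');  // growing through the interface is always safe
        if (s.capacity() != last) {
            changes++;
            last = s.capacity();
        }
    }
    assert(s.size() == n);
    return changes;
}
```

On the common implementations the count is far smaller than n, which shows that they over-allocate rather than reallocating on every change.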

So as a general rule, "plan for the worst". Assume that strings are completely reallocated every time they are modified (which is probably not the case unless you are using the worst C++ standard library on the planet!). That way you will limit your use of strings where efficiency is a concern in your application, since they are generally not the most efficient tool.

lol ..@mike you have obviously not given many interviews. I know people are supposed to be using strings and forget about char arrays, pointers, null terminated etc, but that's probably the most sought after questions in an interview and they make you write all sorts of string reversal, copy,strchr, palindromes blah blah to test the pointer stuff and well I always find it so difficult to conjure up these in an interview.

so my advice to the OP, please do spend time understanding char *, char [] etc :)

lol.. @Agni, you mean like a job interview? I don't do job interviews... I get job offers. hehe.. just kidding.. my last job interview was on writing artificial intelligence algorithms to optimize fuel-consumption in interplanetary missions, so they didn't really bother asking me a string reversal problem.

yes job interviews, what a nightmare.. but well you're on another world all together :) !!

OP sorry for this digression

I am preparing for interviews right now and am brushing up on these basics as I had a tough time interviewing at MS...lol
