I want to represent 2^32 elements in a vector. In reality, I allocate about 500,000 elements and then everything else is there to be used by my program (for storage of data). So I have to have say..

[0-999999] = 0
[100000-599999] = info
[600000 +] = 0 <-really big

I have to be able to place some value into any of the elements that start out with a 0 in it (in our case). The problem is that that means I have to allot a huge amount of memory just to initialize my program. Is there a good way to generate this association to my vector without actually allocating space until it's actually needed?

So in otherwords, can I define elements of a vector without allocating memory for them? I want to be able to fill the vector with all the actual data I have at start-up but I then need to be able to assign vector element [lets say 3] to a value, then generate space in memory for it. This seems like it would require vector to not have consecutive element indexes.

Any ideas?

Anything you want to store is recorded in memory and that element has memory allocated to it.

You can say that the vector could have 500,000 but that will simply create a 500,000 size vector with the memory allocation as well. Vectors are ideally designed so that they can shrink and expand. If your vector size is static, switch to using a standard array. That will save *some* overhead.

The only way to do what you want is to fill it as necessary, but this means you won't be able to randomly access elements you haven't created yet.

You may want to look at redesigning how you hold your data.

That may be a good idea. As of now the large memory bank is required for my design as I am emulating main memory for a MIPS I processor I designed in verilog. The processor must be able to write to and call from all 2^32 elements in memory. At start-up the processor only requires the data that has been allocated but as a program is run with my processor access to all 2^32 elements is required (consider a program that copies a text document and places the text in a new document - this would require access to a large amount of memory.)

Perhaps I could create a table in which any accesses outside of the static data (of which it's location is already known) have their locations in memory (according to the memory call from the processor) tabulated along with the actual information being stored.

How I would do this is have 2 vectors, one for data and one for said data' location in "memory". So lets say my processor wants to write 1,023 into memory element 999,999 then write "abcd" into element 2,000. I could push_back into each vector giving me:

[U]data  / location[/U]
 1023  / 999999
"abcd" / 2000

Is there a better way to do this? Maybe a map.

Also, I get a segfault when I try to make an array that is 4294967296 elements large. Is there any way around this?

Well you're going beyond the maximum value of unsigned long (4,294,967,295), so no. Unless you can use 64bit...

Ok, you need to think about being able to separate out your variables, or, emulate it.

So when your processor asks for memory location 0xff8218273 or whatever, your application checks if that has been allocation, if not, pushes it onto the map using the address location as the key.

But even with this method you're going to have to be ridiculously careful. If you assign an int value at position 0x00008, that int will use 4 bytes. So the end of that memory location should be 0x0000C. Will you be able to handle a request for 0x0000A for example?

Another way is to implement a virtual memory file. Incredibly complex ^^ (not that the other methods aren't, unfortunately)

I don't think that any way you look at this is going to be simple.

std::vector is unsuitable for this. What you are looking for is std::map.

Agreed, this is a bit complicated either way. I could implement a page table (which is what I'm going to do when I design caches in my Verilog design) but for now I'm going to stick with a map. So, because all I need to be able to do is access any location and the data at that location, it doesn't really matter whether my data is stored sequentially or not. I can just dump my static data and instructions into a map with the address of each piece of data as the key, and then as I dynamically make additions to memory I just add a new element with the address as the key. Because addressing is handled by my processor PC I can rely on that value being correct.

I'm a bit confused by the overflow discussion though. The ISA will never allow more than 32 bits of data to be stored in one memory element at a time so why do I have to worry about this?

Because you're addressing it by map location.

There is, technically, nothing to stop you from accessing an address that is a mid point.

So you assign and int to 0x08. This represents byte 8. An integer is 4 bytes in length.
So I write 64,400 to byte 8. In real memory, this would now belong to memory locations 0x08, 0x09, 0x0A, 0x0B <-- 4 bytes.

In your map, you store 64400 into the map location 0x08. Ok, what if I then decided to store 70000 into location 0x09?

In your map, there is no key, this is an available location. So you store 70000 into 0x09, 0x0A, 0x0B, 0x0C. Now do you see the problem? You have data cross over.Say you have a bunch of characters. In real memory the string "abcd" and I saved them at location 0x10. This would put "a" -> 0x01, "b" -> 0x02, "c" -> 0x03, "d" -> 0x04, "/0" -> 0x05. I want to get character 3 from my array. So in real memory I would access 0x03 and get "c"

The map doesn't contain that memory location and throws back an error. Even worse, it DOES have that memory location and gives you back something completely wrong.

I hope that explained why it would be a problem ^^

The MIPS ISA includes a load half word, and even a load byte instruction, so in terms of it being a real problem (as long as it's agreed that this happens under the ISA) it isn't one. Fortunately the MIPS compiler will take care of this for me as far as I understand. The processor will in fact allot x08 x09 x0a and x0b for an integer value and the integer will be addressed using all 4 locations.

I think we're misunderstanding each other here.

From what I see, you want to hold a map of memory locations that correspond to what your processor needs.
That's fine, what I'm saying your problem is that although your compiler will allocate those 4 bytes, your map won't. You won't get memory errors or anything like that, it's just accessing it back and trying to access midpoints in the data when you go to the map and try and retrieve the data, it may not actually be there, because *you* as the programmer didn't save it there. You saved it in your single location.

Unless what happens is, you store a 4byte int as 4 seperate bytes in the map? Which is where I may be misunderstanding you.
Are you going to actually receive an int as 4 seperate bytes?

Hmm, that's a good question. If internally the processor must allocate multiple registers for some information then there is no problem, it will be "cognizant" of the distribution. Lets say that the processor places the addition of 00010 and 00010 which would be 00100. That's 2 + 2 = 4 but none of these values are "integers" in terms of having 4 bytes to represent the number. This value is then loaded into data memory (or the map) with the key 123000. If my map takes elements of type <int,int> what exactly is it that's being stored in location 123000? Is there anyway to take the raw binary data and load it into the map instead of an integer value?

Well at the lowest level your will be working with bytes.
1 byte = 8 bits.
So at the very least, your 2 + 2 = 4 would be 00000010 + 00000010 = 00000100
If they were stored as byte or char.
If they were stored as int you would have 32bits (which I'm not going to type) and as 8 bits = 1 byte, 32bits = 4 bytes.

One way which works in my head is to bitwise & by 255 on the int, then shift it right by 8 places.


char lower = (myInt & (char)255);
myInt >> 8;
char midLower = (myInt & (char)255);
myInt >> 8;
char midUpper = (myInt & (char)255);
myInt >> 8;
char upper = (myInt & (char)255);

You can do it in reverse then to get your int back to it's original form (using OR instead of AND)

This is only something that sounds logical in my head. I haven't tried this ^^

Lemme take a step back. So a map "element" is 1 byte long no matter what?

No sorry, I was skipping a step ^^

If you set your map to std::map<int, byte> then every element of your map has been allocated enough space to store 1 byte. As this maps better to what you sounded like you wanted to acheive

AH, I understand now. My conversion process is a bit strange in terms of my processor. I handle everything in ints outside of my verilog file. Everything done numerically happens on my processor though.

In that case, I'm out of ideas without being able to study the processor and your emulator.

You've drained me of my knowledge, and now I feel empty inside o.O ;)

Hehe, well, I'm pretty happy with how it seems to be working at the moment. My instruction calls are perfect (all million of them...) Thanks for the incite. I wouldn't have thought about the storage space allotted for map as closely so I'm sure that you'll have stopped me from making an annoying little error somewhere down the line.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.