What are some best practices for optimizing memory management when working with large datasets?

I am tagging this topic with both php (because that is my language of choice, and the one I use to work with big data) and c++ (because I know DaniWeb has a large low-level c++ community well suited to delving into this topic in depth, and because years ago, when I focused on c++ myself, I was very focused on efficiency).

seven7pillars commented: Great insights! Efficient memory management is crucial for handling large datasets. Techniques like data chunking, indexing, and using optimized data structures all help.

My first thoughts would be:

  1. What is large? Are we in the TB range or mere handfuls of GB?
  2. How often do you need to do this? Is it once a day, once a month, or just once?

What is large? Are we in the TB range or mere handfuls of GB?

For the sake of argument, let's use my use case and say dozens of gigs and millions of rows.

How often do you need to do this? Is it once a day, once a month, or just once?

For me, the most important thing is real-time read and write performance against tables with millions of rows under high concurrency.

Working efficiently with big data mostly comes down to using the right tools and techniques: filter and aggregate data as close to the storage layer as you can, choose storage solutions that match your access patterns, and hand heavy analytical queries off to a platform built for them. Automating recurring pipelines, leveraging cloud resources that can grow with the dataset, and designing for scalability from the start matter as well.
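
As a rough illustration of keeping the filtering close to the storage layer, here is a small PHP sketch; the connection details, the events table, and the created_at column are placeholders I made up for the example, not anything from this thread. The point is that an aggregate query returns a handful of rows no matter how big the table is, instead of dragging raw rows into PHP.

```php
<?php
// Hypothetical sketch: let the database do the filtering and aggregation
// instead of pulling millions of rows into PHP. The connection details,
// table name (events), and column (created_at) are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Pulls every row across the wire into a PHP array -- avoid on big tables:
// $rows = $pdo->query('SELECT * FROM events')->fetchAll();

// A single aggregate query returns a few rows regardless of table size.
$stmt = $pdo->prepare(
    'SELECT DATE(created_at) AS day, COUNT(*) AS hits
       FROM events
      WHERE created_at >= :since
   GROUP BY DATE(created_at)'
);
$stmt->execute([':since' => '2024-01-01']);
foreach ($stmt as $row) {
    echo $row['day'] . ': ' . $row['hits'] . PHP_EOL;
}
```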

To optimize memory management with large datasets, start by using efficient data types to minimize memory usage. Load data in smaller chunks instead of all at once, and avoid unnecessary data copies by working with references or views. Tools like memory-mapped files or libraries such as Dask can help handle big data efficiently. Also, remember to clean up unused objects promptly to free up memory. These steps will keep your processing smoother and faster.
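
To make the chunking idea concrete in PHP, here is a minimal sketch assuming a MySQL connection and an orders table, both hypothetical: an unbuffered query plus a generator keeps roughly one row in PHP memory at a time instead of materializing the whole result set.

```php
<?php
// Hypothetical sketch: stream rows from a large table instead of loading
// them all at once. Connection details and the orders table are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
// Disable client-side result buffering so rows arrive as we iterate.
// Note: the connection can't run other queries until the statement is consumed.
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

// A generator yields one row at a time, so callers never hold the whole set.
function streamRows(PDO $pdo, string $sql): Generator
{
    $stmt = $pdo->query($sql);
    while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        yield $row;
    }
}

$total = 0.0;
foreach (streamRows($pdo, 'SELECT id, amount FROM orders') as $row) {
    $total += (float) $row['amount'];   // process, then let the row go
}
echo "Sum: $total, peak memory: " . memory_get_peak_usage(true) . " bytes\n";
```

With fetchAll() or a buffered query, peak memory grows with the number of rows; with the unbuffered generator it stays roughly flat, which is what you want at millions of rows.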
