View Full Version : How to achieve fastest database insertion



karankumar1609
18th November 2013, 11:53
Hello Everyone,

I am working with a large database, and I have to insert 12000 records per second.
I found a MySQL storage ENGINE (TokuDB) that promises fast insertion, and I am now using it.
But the problem is that when I insert a record into my database using QSqlQuery, it takes approximately 80 ms, which is far too slow.

Does anyone know how to achieve faster insertion using Qt?

Lesiok
18th November 2013, 13:19
First of all, group many inserts into one transaction. The default is one insert per transaction, and that is the problem.
Show how you do it.

karankumar1609
18th November 2013, 13:26
I am inserting using this code:



QTime timex;
timex.start();

static int i = 0; // counter must be live, not commented out, for the qDebug below
QString query = TableQueries::insertDataQuery();
qDebug() << ++i;

qDebug() << "::> " << timex.elapsed();
db_query.prepare(query);
qDebug() << ":::> " << timex.elapsed();

int pos = 0;
db_query.bindValue(pos++, Id);
db_query.bindValue(pos++, date);
db_query.bindValue(pos++, time);
db_query.bindValue(pos++, value);

qDebug() << "::::> " << timex.elapsed();
bool exec = db_query.exec();
qDebug() << ":::::> " << timex.elapsed();


And the frequency of my data is around 12000 records per second.
How can a transaction be used in that case?

Lesiok
18th November 2013, 14:30
The problem is that you run this code for every record. Change the code so that it runs for a group of records (e.g. every 100 records), something like this:

QString query = TableQueries::insertDataQuery();
db.transaction(); // begin transaction
db_query.prepare(query);
bool exec = false;
for( int i = 0; i < Id_list.size(); i++ )
{
    int pos = 0;
    db_query.bindValue(pos++, Id_list.at(i));
    db_query.bindValue(pos++, date_list.at(i));
    db_query.bindValue(pos++, time_list.at(i));
    db_query.bindValue(pos++, value_list.at(i));
    exec = db_query.exec();
    if( !exec )
        break;
}
if( exec )
    db.commit();
else
    db.rollback();

Remember, this is just an idea, not working code.
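A further MySQL-specific speed-up, not shown in this thread, is a multi-row INSERT: sending many rows in one statement means the server parses one statement per batch instead of one per row. The helper below is a hedged sketch that only builds the placeholder string (the table name and column count are illustrative); the values would still be bound with QSqlQuery afterwards:

```cpp
#include <string>

// Build "INSERT INTO <table> VALUES (?,?,..),(?,?,..),..." for `rows` rows
// of `cols` columns each. The resulting string is then prepared once and
// bound positionally, row after row.
std::string buildMultiRowInsert(const std::string &table, int cols, int rows)
{
    std::string oneRow = "(";
    for (int c = 0; c < cols; ++c)
        oneRow += (c == 0) ? "?" : ",?";
    oneRow += ")";

    std::string sql = "INSERT INTO " + table + " VALUES ";
    for (int r = 0; r < rows; ++r) {
        if (r > 0)
            sql += ",";
        sql += oneRow;
    }
    return sql;
}
```

Note that the batch size is then fixed at prepare time, so a leftover partial batch needs a second, smaller statement.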

karankumar1609
18th November 2013, 16:19
Hi Lesiok,

Thanks for your solution. I have tried it, but it is still not sufficient for me.

It inserts 120 records in 107 ms, which works out to roughly 1200 records per second.
But my live feed will deliver ~12000 records per second.
I am searching for something that can insert even faster.
I am using TokuDB, which claims to insert a million records per second, but somehow my database does not do the same for me.

Any idea why?

Lesiok
18th November 2013, 17:40
It depends on many conditions. As you can see, a single INSERT took 80 ms, while 120 INSERTs in one transaction took 107 ms. What happens if you increase the number of INSERT statements per transaction? What is the difference between a program compiled in debug mode and one compiled in release mode? Is the database on the same computer or on another one?
I think TokuDB's advertised speed is like car manufacturers' fuel figures. The manufacturer claims a fuel consumption of 5 liters per 100 km; they just forget to add that this was measured in a laboratory with no relation to reality, and on the road the car consumes 6.5 liters per 100 km ;)

ChrisW67
18th November 2013, 21:10
You achieve a million inserts a second with multiple processors running multiple insert threads against a database engine on the same machine, running entirely in RAM, over a persistent low-latency connection, into a table with no indexes, constraints, or foreign keys, and with code that is optimised within an inch of its life. Even then you probably round the result up to the nearest good marketing value. In short, the figure is unrealistic.

This code:


#include <QCoreApplication>
#include <QSqlDatabase>
#include <QSqlQuery>
#include <QSqlError>
#include <QDate>
#include <QTime>
#include <QDebug>

int main(int argc, char **argv) {
    QCoreApplication app(argc, argv);

    QSqlDatabase db = QSqlDatabase::addDatabase("QMYSQL");
    db.setHostName("localhost");
    db.setDatabaseName("test");
    db.setUserName("test");
    db.setPassword("");
    if (db.open()) {
        // Set up a test table
        QSqlQuery qry;
        if (qry.exec("CREATE TEMPORARY TABLE atable(id INTEGER, d DATE, t TIME, v INTEGER)")) {
            QTime stopwatch;
            QDate testDate = QDate::currentDate();
            QTime testTime = QTime::currentTime();

            db.transaction();
            stopwatch.start();
            qry.prepare("INSERT INTO atable VALUES(?, ?, ?, ?)");
            for (int i = 0; i < 10000; ++i) {
                qry.bindValue(0, i);
                qry.bindValue(1, testDate);
                qry.bindValue(2, testTime);
                qry.bindValue(3, i);
                if (!qry.exec())
                    qDebug() << qry.lastError();
            }
            qDebug() << stopwatch.elapsed();
            db.commit();
        }
        else
            qDebug() << qry.lastError();
    }
    else
        qDebug() << db.lastError();
    return 0;
}

It does 10000 inserts in 2.8 seconds over a 100 Mbps LAN to a remote server.
It does 10000 inserts in 0.29 seconds over a local socket (this Linux MySQL machine is faster than the other, but not 10 times faster).
It does 10000 inserts in 0.37 seconds over the loopback network interface. The last two scale to 29.8 and 38.1 seconds respectively for 1e6 records.

You do not, for example, want to be establishing a new connection for each insert. You do not want to be preparing a query for each insert: the query does not change, only the values. The mechanics of committing a transaction to disk are likely to take at least as long as a complete rotation of the disk platter, possibly two, or about a millisecond, so you cannot afford to commit often. Network delays are likely to be a large component. Avoid type conversions if at all possible.
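One Qt-level option not mentioned in the thread is QSqlQuery::execBatch(), which binds a whole QVariantList per placeholder and lets the driver submit all rows in one call; whether it beats a prepare-once/exec-per-row loop depends on the driver. A minimal sketch, assuming the same four-column `atable` layout and an already-open default connection:

```cpp
#include <QSqlQuery>
#include <QVariantList>
#include <QDate>
#include <QTime>

// Sketch: batch-insert ids.size() rows in a single driver call.
// Assumes the `atable` layout used earlier in this thread.
bool batchInsert(const QVariantList &ids, const QVariantList &values,
                 const QDate &d, const QTime &t)
{
    QSqlQuery qry;
    qry.prepare("INSERT INTO atable VALUES(?, ?, ?, ?)");

    QVariantList dates, times;
    for (int i = 0; i < ids.size(); ++i) {   // one column entry per row
        dates << d;
        times << t;
    }

    qry.addBindValue(ids);     // each addBindValue() supplies a whole column
    qry.addBindValue(dates);
    qry.addBindValue(times);
    qry.addBindValue(values);
    return qry.execBatch();    // the driver may collapse this into one round trip
}
```

As with any batching, this should still be wrapped in a transaction, and the timing should be compared against the plain loop before committing to it.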


How long do you have to sustain 12000 records per second? What are the actual data types?

karankumar1609
19th November 2013, 06:35
Thanks for your solutions ;):o

I used the following data types in the CREATE TABLE command:


CREATE TEMPORARY TABLE atable(id INTEGER, name varchar(255), date varchar(15), time varchar(15), value INTEGER)


My insertion speed for 12000 records is 250 ms, and it remains constant.
My database currently contains 9630512 rows.

I have to work with live data and insert it into the database.
I am quite puzzled about how to design a better structure to capture the live data and store it. :confused::(

If you have any ideas about that, please share them with me and I will do the rest.
Thanks again.

ChrisW67
19th November 2013, 07:05
If you are inserting 12000 rows in 0.25 seconds then you already met your stated target. What is the problem?

karankumar1609
19th November 2013, 07:26
Sorry, that was a mistake in my previous post: it is 1200 records per 250 ms :rolleyes:
And 12000 records per ~800 ms at minimum.
The problem is that I do not want to be stuck for 800 ms while inserting records, because the live data arrives at a very high frequency.
:(

ChrisW67
19th November 2013, 10:20
It would help if you answered some of the outstanding questions.


Is the database on the same computer or on another one?

How long do you have to sustain 12000 records per second? What are the actual data types?

Where is the bottleneck, and what have you done to isolate it? There are physical limits to how much data you can push over a network, but that is irrelevant if your database server is overloaded.

karankumar1609
19th November 2013, 10:55
Is the database on the same computer or on another one?

Yes, the database is on the same computer I am working on.


How long do you have to sustain 12000 records per second? What are the actual data types?

I will receive 12000 records per second for at least 5 hours,
and my current insertion time for 12000 records is approximately 800 ms.
The data types used by the table are VARCHAR and INT.

Lesiok
19th November 2013, 13:22
Maybe a solution is to write the samples to a regular file and upload them to the database in another thread. That way you are immune to momentary changes in server performance. For example, records go to a first file for one minute; after that minute, records go to a new file and the previous one is pushed to the database. And so on...
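This decoupling of capture from insertion can also be done with an in-memory queue drained by a worker thread, a variation on the file-based scheme above rather than code from this thread. The sketch below uses standard C++ so the buffering logic stands alone; in the real application the consumer thread would write each drained batch inside one transaction:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Records accumulate in queue_; a worker drains them in batches, so the
// producer (the live feed) never blocks on database latency.
class BatchBuffer {
public:
    void push(int record) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(record);
        cv_.notify_one();
    }

    // Block until at least one record is available, then drain everything.
    std::vector<int> drain() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        std::vector<int> batch;
        while (!queue_.empty()) {
            batch.push_back(queue_.front());
            queue_.pop();
        }
        return batch;
    }

private:
    std::queue<int> queue_;
    std::mutex mutex_;
    std::condition_variable cv_;
};
```

With this shape, commit frequency is governed by how often the consumer calls drain(), not by the feed rate, which is exactly the property the file-per-minute idea provides.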

ChrisW67
19th November 2013, 23:33
You are receiving 12000 records in 1.0 second.
You are inserting 12000 records in 0.8 seconds.
It seems to me you are keeping up just fine, although I do wonder why your figure is 3 times longer than mine (especially given your super-duper MySQL backend vs my general MyISAM tables).

Here are some more questions to answer for your own benefit:
Where is the bottleneck? You still have not identified that. If you marshal the data into the query without actually executing it, how long does it take? What portion of the total time is that?
Does this data need to be stored in an RDBMS at all?
Is text the most efficient method of storing a date or time?
Can the data be aggregated on-the-fly and only the aggregate data stored?
Can Lesiok's approach work for your application?

karankumar1609
20th November 2013, 05:23
Hello ChrisW67

I have implemented the same code as yours, and that figure is the total time to insert the records into the table.


Where is the bottleneck? You still have not identified that. If you marshal the data into the query without actually executing it: how long does it take? What portion of the total time is this?

The problem is why my code takes ~0.8 seconds for the insertion.
Everything before executing the query takes barely 0-1 ms.
The ~0.8 seconds is the total time:



QTime time;
time.start();

QSqlQuery query;
database_manager_.transaction();
for(int i = 0; i < queue.size(); ++i) {
    ...
    ...
    query.exec();
}
database_manager_.commit();

qDebug() << ">> " << time.elapsed();



Does this data need to be stored in an RDBMS at all?
Yes, the data needs to be stored in an RDBMS.


Is text the most efficient method of storing a date or time?
I will change these variables from strings to date and time types. It doesn't matter to me.


Can the data be aggregated on-the-fly and only the aggregate data stored?
The data will be aggregated on-the-fly.


Can Lesiok's approach work for your application?
Yes, it works, but the execution time is a little high. Do I need any compiler optimization?

Lesiok
20th November 2013, 09:13
Yes it works, but execution time is little high, Should i need any compiler optimization?
What does "execution time is a little high" mean? Maybe your computer is too weak for this job, or misconfigured?

karankumar1609
20th November 2013, 13:06
Thanks for your help. I have changed the way I insert data into the table.
Now I have to insert only 300 records per second, which is quite a lot less than 12000.
:);):D:p