QSortFilterProxyModel quite slow for bigger data-models [Archive]

kerim

8th April 2011, 10:03

hi,

are there any solutions for the following problems? -->

1. my QSortFilterProxyModel is quite slow when the source model contains a larger amount of data (e.g. for a logfile of about 100,000 lines, line-wise filtering with a regular expression, and also with a fixed string, takes more than 1 minute). are there any alternatives?

2. if there is no solution for 1. how can i be notified about when the proxy has finished filtering to react on it within my gui ??

3. is it possible to merge two different successive filter results ?
--> the results of a filtering are shown within a table-view, the filter-results of a second filter-request should merge into the results from the first search .. is this possible!?

thnx.

wysota

8th April 2011, 10:58

1. my QSortFilterProxyModel is quite slow when the source model contains a larger amount of data (e.g. for a logfile of about 100,000 lines, line-wise filtering with a regular expression, and also with a fixed string, takes more than 1 minute). are there any alternatives?
Don't use regular expressions and try to cache results from previous filters.

2. if there is no solution for 1. how can i be notified about when the proxy has finished filtering to react on it within my gui ??
You can't :)

3. is it possible to merge two different successive filter results ?
--> the results of a filtering are shown within a table-view, the filter-results of a second filter-request should merge into the results from the first search .. is this possible!?
Please elaborate.

kerim

8th April 2011, 12:02

Don't use regular expressions and try to cache results from previous filters.

hmm, how do i cache results ? is there any option or switch from the proxy to be set/turned on for the results to be cached?

what i discovered too: the second filter request to my proxy is way faster than the first one, so there must be some automatic caching in the background, but the first one being so slow is annoying.

You can't :)
that's bad. why isn't any signal sent from the proxy side when filtering has finished?

Please elaborate.
i meant the following:
imagine some data being visualized by a QTableView which has a certain custom-made model.
further imagine another QTableView (FilterView) which displays the results from a proxy-filter-request.

whenever i start a filter-request to my QSortFilterProxyModel the previous results within my view are lost (overwritten) by the current filter-results.

for example: i search for all lines within my source-view that contain the string "HELLO" --> the proxy filters the model-data and puts the results matching that pattern into my filter-view. now if i want to filter all lines containing the string "WORLD" i would like these lines to appear at the appropriate line positions within my filter-view too and NOT having my previous results be cleared.

greets.

wysota

8th April 2011, 12:19

hmm, how do i cache results ?
It depends on how your model works.

is there any option or switch from the proxy to be set/turned on for the results to be cached?
No, you have to implement that yourself.

what i discovered too: the second filter request to my proxy is way faster than the first one, so there must be some automatic caching in the background, but the first one being so slow is annoying.
The speedup is not related to filtering. There is no cache in QSortFilterProxyModel. Some data might be cached by the view and that's what might be causing the speedup.

that's bad. why isn't any signal sent from the proxy side when filtering has finished?
Filtering is done synchronously and the regular model signals are emitted from it, like rowsInserted(), rowsRemoved(), etc. You could connect to those.

i meant the following:
imagine some data being visualized by a QTableView which has a certain custom-made model.
further imagine another QTableView (FilterView) which displays the results from a proxy-filter-request.

whenever i start a filter-request to my QSortFilterProxyModel the previous results within my view are lost (overwritten) by the current filter-results.

for example: i search for all lines within my source-view that contain the string "HELLO" --> the proxy filters the model-data and puts the results matching that pattern into my filter-view. now if i want to filter all lines containing the string "WORLD" i would like these lines to appear at the appropriate line positions within my filter-view too and NOT having my previous results be cleared.
You can reuse earlier calculated data but you need to gather, store and use it yourself by subclassing QSortFilterProxyModel.

kerim

8th April 2011, 14:12

just 4 not misunderstanding u (wysota):

caching would be then to store pairs of QModelIndex and QVariants within the subclass of QSortFilterProxyModel ??
i would then connect to the signal (lets say) QAbstractItemModel::rowsAboutToBeInserted
and push the cache to my proxy-model via setData(..) ???

wysota

8th April 2011, 14:39

caching would be then to store pairs of QModelIndex and QVariants within the subclass of QSortFilterProxyModel ??
Caching is just a concept to speed things up by reusing already calculated data. The way you employ the concept depends on a particular use-case.

i would then connect to the signal (lets say) QAbstractItemModel::rowsAboutToBeInserted
and push the cache to my proxy-model via setData(..) ???
No, definitely not.

Let's assume a very simple example: you have a filter model that checks a simple string against the contents of the base model; if the string is found in a row of the base model, the filter accepts that row. If at some point your filter string is "abc" and you then receive a new string "abcd", you know the result cannot contain any row that "abc" already filtered out, so you only need to re-check the rows that contained "abc" and drop those that do not also contain "abcd". Thus you can reuse your earlier (cached) calculations to "repair" your result. Remember that what makes QSortFilterProxyModel work is the QSortFilterProxyModel::filterAcceptsRow() method. Here's a sample implementation (not checked with a compiler so it might not build):

#include <QSet>
#include <QSortFilterProxyModel>
#include <QString>

class IncrementalSortFilterProxyModel : public QSortFilterProxyModel {
    Q_OBJECT
public:
    IncrementalSortFilterProxyModel(...) : ... {

    }
public slots:
    void setFilter(const QString &str) {
        recalculateFilter(str);
    }
protected:
    // accept every row while no filter is set, otherwise only the cached rows
    bool filterAcceptsRow ( int source_row, const QModelIndex & source_parent ) const {
        if(m_lastFilter.isEmpty() || m_cache.contains(source_row)) return true;
        return false;
    }
    void recalculateFilter(const QString &str) {
        if(!m_lastFilter.isEmpty() && str.contains(m_lastFilter)) {
            // the new filter extends the old one, so the result can only shrink:
            // update the cache by removing items
            for(QSet<int>::iterator iter = m_cache.begin(); iter!=m_cache.end();) {
                QModelIndex idx = sourceModel()->index(*iter, 0);
                if(!idx.data().toString().contains(str))
                    iter = m_cache.erase(iter); // erase() returns the next valid iterator
                else ++iter;
            }
        } else if(!m_lastFilter.isEmpty() && m_lastFilter.contains(str)) {
            // the new filter is part of the old one, so the result can only grow:
            // update cache by adding items
            for(int i=0;i<sourceModel()->rowCount();++i){
                if(m_cache.contains(i)) continue;
                QModelIndex idx = sourceModel()->index(i, 0);
                if(idx.data().toString().contains(str)) m_cache.insert(i);
            }
        } else {
            // recompute everything
            m_cache.clear();
            if(!str.isEmpty())
                for(int i=0;i<sourceModel()->rowCount();++i){
                    QModelIndex idx = sourceModel()->index(i, 0);
                    if(idx.data().toString().contains(str)) m_cache.insert(i);
                }
        }
        m_lastFilter = str;
        invalidateFilter(); // makes the proxy call filterAcceptsRow() again
    }
private:
    QString m_lastFilter;  // filter string used for the current cache
    QSet<int> m_cache;     // source rows accepted by m_lastFilter
};

Of course this works only for simple cases. For complex ones you need to build upon the solution.

BTW. You might even recompute the cache in an external thread and only then call invalidateFilter() to reset the views.
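
To make the external-thread idea a bit more concrete, here is a minimal Qt-free sketch in standard C++. It assumes the rows can be copied into a plain vector of strings; computeMatches() and startRecompute() are made-up names for illustration, not Qt API. In a real proxy the finished set would replace the cache on the GUI thread, followed by a single invalidateFilter() call.

```cpp
#include <cassert>
#include <future>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: scans a snapshot of the rows and returns the
// indices of all rows that contain the filter string.
std::set<int> computeMatches(const std::vector<std::string> &rows,
                             const std::string &filter)
{
    std::set<int> matches;
    for (int i = 0; i < static_cast<int>(rows.size()); ++i) {
        if (rows[i].find(filter) != std::string::npos)
            matches.insert(i);
    }
    return matches;
}

// Hypothetical helper: starts the scan on a worker thread. The rows are
// passed by value, so the worker operates on its own copy and never
// touches data owned by the GUI thread.
std::future<std::set<int>> startRecompute(std::vector<std::string> rows,
                                          std::string filter)
{
    std::future<std::set<int>> pending =
        std::async(std::launch::async, computeMatches,
                   std::move(rows), std::move(filter));
    assert(pending.valid()); // std::async hands back a usable future
    return pending;
}
```

The caller keeps the returned future and fetches the result with get() once it is ready. Copying the rows costs memory, but it avoids reading the model from a second thread, which item models are generally not prepared for.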

kerim

8th April 2011, 14:51

thnx wysota 4 ur (fast posted) code proposal :))

but i think we misunderstood each other, actually my fault:
my question in my previous post, where i used the term caching, was not about speeding up filtering (that's a different problem and i will definitely use your thoughts on it to solve that).

what i was asking in my last posting was in regard to situations where i for example enter some kind of search pattern within a line-edit then press enter and let the proxy filter data.
the filtered data is then shown within a table-view which is connected to my filterproxymodel.
now, the next time i enter a new search pattern i do NOT want the current results (displayed in the table-view) to be cleared; instead the results of the next filter query should be merged consistently into the existing ones. example: the first query matches lines 4 and 6 of the source-model (the view displays lines 4 and 6), the second query matches line 5 --> the view should then display lines 4, 5 and 6, not only 5.
(dont know if its clear :confused:)

wysota

8th April 2011, 14:58

what i was asking in my last posting was in regard to situations where i for example enter some kind of search pattern within a line-edit then press enter and let the proxy filter data.
the filtered data is then shown within a table-view which is connected to my filterproxymodel.
now, the next time i enter a new search pattern i do NOT want the current results (displayed in the table-view) to be cleared; instead the results of the next filter query should be merged consistently into the existing ones. example: the first query matches lines 4 and 6 of the source-model (the view displays lines 4 and 6), the second query matches line 5 --> the view should then display lines 4, 5 and 6, not only 5.
(dont know if its clear :confused:)
That's the same case. Note that your new filter is an alternative of the previous search term and the current one ("row contains term1 or term2"). You just use the cache differently than in my example; the idea remains the same.

You might even have separate caches for each term and then pass the filter as a string list and simply check which filter results you already have cached and which you're lacking and only compute those. Then in filterAcceptsRow() check that any of the active caches contains the row in question.
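
A rough Qt-free sketch of the per-term caches, in standard C++. MultiTermFilter, setTerms(), acceptsRow() and scans() are hypothetical names chosen for this illustration; in a real subclass acceptsRow() would be the body of filterAcceptsRow() and setTerms() would end with invalidateFilter().

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the proxy: one cached set of matching source
// rows per search term, plus the list of terms that are currently active.
class MultiTermFilter {
public:
    explicit MultiTermFilter(std::vector<std::string> rows)
        : m_rows(std::move(rows)), m_scans(0) {}

    // Activates the given terms. Only terms without a cache entry are
    // scanned; results of earlier searches are reused as they are.
    void setTerms(const std::vector<std::string> &terms) {
        for (const std::string &term : terms) {
            if (m_caches.count(term)) continue; // already computed earlier
            std::set<int> &hits = m_caches[term];
            for (std::size_t i = 0; i < m_rows.size(); ++i) {
                if (m_rows[i].find(term) != std::string::npos)
                    hits.insert(static_cast<int>(i));
            }
            ++m_scans;
        }
        m_active = terms;
    }

    // Counterpart of filterAcceptsRow(): a row is shown if any of the
    // active terms matched it ("row contains term1 or term2").
    bool acceptsRow(int row) const {
        for (const std::string &term : m_active) {
            std::map<std::string, std::set<int> >::const_iterator it =
                m_caches.find(term);
            assert(it != m_caches.end()); // setTerms() caches every active term
            if (it->second.count(row)) return true;
        }
        return false;
    }

    // Number of full passes over the rows so far (for illustration only).
    int scans() const { return m_scans; }

private:
    std::vector<std::string> m_rows;
    std::map<std::string, std::set<int> > m_caches;
    std::vector<std::string> m_active;
    int m_scans;
};
```

With "HELLO" in rows 4 and 6 and "WORLD" in row 5, calling setTerms() first with "HELLO" and then with both terms leaves rows 4, 5 and 6 accepted, and the second call only scans for "WORLD". This sketch never drops a cache entry, so it would need an invalidation step once the source data changes.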

kerim

11th April 2011, 09:57

one more question left:

i have the following method within my custom-subclass from QSortFilterProxyModel
which is called whenever filtering procedure begins (from outside).

the problem is: it crashed at the line, where the created model-index is mapped to the source-model-index but i could not figure out what i am doing wrong.

does anyone know why?

// //////////////////////////////////
void FilterProxyModel::FillCache()
// //////////////////////////////////
{
    if( !cachingActive )
        return;

    for(ulong row=0; row<rowCount(); row++)
    {
        QModelIndex pIdx = createIndex(row,0);
        QModelIndex sourceRow = mapToSource(pIdx); // <<--- crashes here :((
        if( !cache.contains(sourceRow.row()))
            cache.insert(sourceRow.row());
    }
}

--->> ok i got it:

i should have used index(..) rather than createIndex(..)

wysota

11th April 2011, 10:05

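Line #10 (the createIndex() call) is invalid. You can't use createIndex() like that: an index built this way lacks the internal data the proxy relies on in mapToSource(). Use index() instead.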