Inconsistent counts between a terms aggregation alone and the same aggregation used as a sub-aggregation

Here is the code I use to build the aggregations:

public function getNbContributions(ParamFetcher $paramFetcher)
    {
        $index = $this->client->getIndex($this->container->getParameter('nuxeo_doc_index'));
        $query = new ElasticaQuery();
        if ($paramFetcher->get('periode') != null) {
            $histogramInterval = '';
            if (strtolower(trim($paramFetcher->get('periode'))) == 'a') {
                $histogramInterval = 'year';
            } else if (strtolower(trim($paramFetcher->get('periode'))) == 'm') {
                $histogramInterval = 'month';
            } else if (strtolower(trim($paramFetcher->get('periode'))) == 's') {
                $histogramInterval = 'week';
            }
            $dateAggregation = new ElasticaAggregationDateHistogram('dateCreateHistogram', 'date_create', $histogramInterval);
            $dateAggregation->setFormat("dd-MM-YYYY");
            $tagsAggregation = new ElasticaAggregationTerms('tagsAggregation');
            $tagsAggregation->setField('tag.keyword'); // .keyword is required to avoid the "Fielddata is disabled on text fields by default" error
            $dateAggregation->addAggregation($tagsAggregation);
            $query->addAggregation($dateAggregation);
        }
        else {
            $tagsAggregation = new ElasticaAggregationTerms('tagsAggregation');
            $tagsAggregation->setField('tag.keyword'); // .keyword is required to avoid the "Fielddata is disabled on text fields by default" error
            $query->addAggregation($tagsAggregation);
        }
        $query->setSize(0);
        $found = $index->search($query);
        if ($paramFetcher->get('periode') != null) {
            $ret = json_encode($found->getAggregation('dateCreateHistogram'));
        }
        else {
            $ret = json_encode($found->getAggregation('tagsAggregation'));
        }
        $ret = json_decode($ret, true);
        return $ret;
    }

As you can see, the code reads the ParamFetcher parameter periode. When I do not set it, I get this result:

{
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 2,
    "buckets": [
        {
            "key": "selfcare",
            "doc_count": 2
        },
        {
            "key": "telma",
            "doc_count": 2
        },
        {
            "key": "angular",
            "doc_count": 1
        },
        {
            "key": "basedeconnaissance",
            "doc_count": 1
        },
        {
            "key": "elasticsearch",
            "doc_count": 1
        },
        {
            "key": "html",
            "doc_count": 1
        },
        {
            "key": "image",
            "doc_count": 1
        },
        {
            "key": "php",
            "doc_count": 1
        },
        {
            "key": "pulseacademy",
            "doc_count": 1
        },
        {
            "key": "q2a",
            "doc_count": 1
        }
    ]
}

And when I set the periode parameter to "a", I get this result:

{
    "buckets": [
        {
            "key_as_string": "01-01-2018",
            "key": 1514764800000,
            "doc_count": 11,
            "tagsAggregation": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": []
            }
        },
        {
            "key_as_string": "01-01-2019",
            "key": 1546300800000,
            "doc_count": 0,
            "tagsAggregation": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": []
            }
        },
        {
            "key_as_string": "01-01-2020",
            "key": 1577836800000,
            "doc_count": 26,
            "tagsAggregation": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 2,
                "buckets": [
                    {
                        "key": "selfcare",
                        "doc_count": 2
                    },
                    {
                        "key": "telma",
                        "doc_count": 2
                    },
                    {
                        "key": "angular",
                        "doc_count": 1
                    },
                    {
                        "key": "basedeconnaissance",
                        "doc_count": 1
                    },
                    {
                        "key": "elasticsearch",
                        "doc_count": 1
                    },
                    {
                        "key": "html",
                        "doc_count": 1
                    },
                    {
                        "key": "image",
                        "doc_count": 1
                    },
                    {
                        "key": "php",
                        "doc_count": 1
                    },
                    {
                        "key": "pulseacademy",
                        "doc_count": 1
                    },
                    {
                        "key": "q2a",
                        "doc_count": 1
                    }
                ]
            }
        }
    ]
}

So why is the doc_count property equal to 26 for "key_as_string": "01-01-2020", when the counts returned without the periode parameter do not add up to 26?
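
One way to check whether the two numbers are even measuring the same thing is to total the tag counts and compare them with the histogram bucket's doc_count. The sketch below does that on a copy of the 2020 bucket from the response above. The helper sumTagOccurrences is a hypothetical name, not part of Elastica; it relies only on the documented meaning of the fields: a histogram bucket's doc_count counts every document whose date_create falls in that interval, tagged or not, while each terms bucket counts the documents carrying that tag, so a document with several tags is counted once per tag and a document without tags is counted in none.

```php
<?php
// Hypothetical helper (not part of Elastica): totals the tag occurrences
// reported by a decoded terms aggregation, including the buckets that were
// cut off by the default top-10 limit (sum_other_doc_count).
function sumTagOccurrences(array $termsAgg): int
{
    $total = $termsAgg['sum_other_doc_count'] ?? 0;
    foreach ($termsAgg['buckets'] ?? [] as $bucket) {
        $total += $bucket['doc_count'];
    }
    return $total;
}

// Copy of the 2020 histogram bucket from the response above.
$bucket2020 = [
    'doc_count' => 26,
    'tagsAggregation' => [
        'sum_other_doc_count' => 2,
        'buckets' => [
            ['key' => 'selfcare', 'doc_count' => 2],
            ['key' => 'telma', 'doc_count' => 2],
            ['key' => 'angular', 'doc_count' => 1],
            ['key' => 'basedeconnaissance', 'doc_count' => 1],
            ['key' => 'elasticsearch', 'doc_count' => 1],
            ['key' => 'html', 'doc_count' => 1],
            ['key' => 'image', 'doc_count' => 1],
            ['key' => 'php', 'doc_count' => 1],
            ['key' => 'pulseacademy', 'doc_count' => 1],
            ['key' => 'q2a', 'doc_count' => 1],
        ],
    ],
];

printf(
    "%d documents, %d tag occurrences\n",
    $bucket2020['doc_count'],
    sumTagOccurrences($bucket2020['tagsAggregation'])
);
// prints "26 documents, 14 tag occurrences"
```

The tag total (14) matches the result obtained without periode, so the terms aggregation itself looks consistent in both requests. The 26 is a document count for the year, and the 2018 bucket (11 documents, no tag buckets) suggests that many indexed documents simply have no value in tag.keyword. That reading is an assumption based on the output shown, not something verified against the index.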

Source: Symfony Questions
