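One-Line Code Change: The Secret Behind a 7x Speedup in Logtail's Multi-Line Log Collection

Editor's Introduction

An interesting observation caught the author's attention: when a line-start regular expression is enabled to handle multi-line logs, collection performance drops. What causes this? This article explores the secret behind the improvement in Logtail's multi-line log collection performance.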

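Background

In log analysis, Logtail is a widely used log collection tool, so any improvement in its performance noticeably raises overall efficiency. Recently, while performance-testing Logtail, I noticed something interesting: when a line-start regular expression is enabled to handle multi-line logs, collection performance drops. What causes this? Let's explore the secret behind the improvement in Logtail's multi-line log collection performance.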

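Analysis

To understand this behavior, we first need to look at how Logtail handles multi-line logs. Logtail's multi-line merge feature uses a specific log format to aggregate data scattered across several lines into complete events. The workflow is as follows: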

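1. The user configures a line-start regular expression.

2. Logtail applies this regex to the beginning of every log line.

3. If a line does not match, Logtail keeps waiting until it finds a line whose start matches.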
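The three steps above can be sketched as a small merge routine. This is only an illustration under stated assumptions, not Logtail's actual code: the function name MergeMultiline is hypothetical, std::regex from the standard library stands in for Boost so the snippet compiles without dependencies, and lines that appear before the first matching line are simply kept as their own event, which may differ from what Logtail does.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper, not part of Logtail.
// Groups raw lines into multi-line events: a line that fully matches the
// line-start regex opens a new event; any other line is appended to the
// event that is currently open.
std::vector<std::string> MergeMultiline(const std::vector<std::string>& lines,
                                        const std::regex& lineStart) {
    std::vector<std::string> events;
    for (const std::string& line : lines) {
        // Assumption: if nothing is open yet, start an event anyway so
        // that no input line is dropped.
        if (events.empty() || std::regex_match(line, lineStart)) {
            events.push_back(line);
        } else {
            events.back() += "\n" + line;
        }
    }
    return events;
}
```

Called with the lines "cnt:1,log:Exception", "  at com.example.A", "cnt:2,log:ok" and the regex cnt.*, this returns two events, the first of which contains the stack-trace line.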

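For example, suppose we have logs in the following format. We would typically configure the line-start regex as cnt.*; Logtail then applies this regex to every line and merges the individual lines into one complete multi-line log.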

cnt:13472391,thread:2,log:Exception in thread "main" java.lang.NullPointerExceptionat  com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitle

at com.example.myproject.Book.getTitle at com.example.myproject.Book.getTitle at com.example.myproject.Book.getTitle

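Looking at the implementation, Logtail performs this matching with the boost::regex_match function. Given the input regex reg, this function must match the entire input log buffer. For the log above, for instance, cnt.* matches all 1072 characters of the first line.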

bool BoostRegexMatch(const char* buffer, size_t size, const boost::regex& reg, string& exception) {
    // ...
    if (boost::regex_match(buffer, buffer + size, reg))
        return true;
    // ...
}

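I wrote the following test code and found that, as the part of the log unrelated to the line-start regex (the part matched by .*) gets longer, the time spent in boost::regex_match grows linearly as well.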

static void BM_Regex_Match(int batchSize) {
    std::string buffer = "cnt:";
    std::string regStr = "cnt.*";
    boost::regex reg(regStr);
    // Truncate the output file left over from any previous run.
    std::ofstream outFile("BM_Regex_Match.txt", std::ios::trunc);
    outFile.close();
    for (int i = 0; i < 1000; i++) {
        std::ofstream outFile("BM_Regex_Match.txt", std::ios::app);
        // Grow the log line by one character per round.
        buffer += "a";
        int count = 0;
        uint64_t durationTime = 0;
        for (int j = 0; j < batchSize; j++) {
            count++;
            uint64_t startTime = GetCurrentTimeInMicroSeconds();
            if (!boost::regex_match(buffer, reg)) {
                std::cout << "error" << std::endl;
            }
            durationTime += GetCurrentTimeInMicroSeconds() - startTime;
        }
        outFile << i << '\t' << "durationTime: " << durationTime << std::endl;
        // Throughput: bytes matched per second.
        outFile << i << '\t' << "process: "
                << formatSize(buffer.size() * (uint64_t)count * 1000000 / durationTime) << std::endl;
        outFile.close();
    }
}

int main(int argc, char** argv) {
    logtail::Logger::Instance().InitGlobalLoggers();
    std::cout << "BM_Regex_Match" << std::endl;
    BM_Regex_Match(10000);
    return 0;
}

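This is the point to pay attention to: when we use a line-start regex, we usually only need to match the beginning of a single line, which in this log is cnt. We do not need the .* part at all, and matching it costs performance for no benefit. The effect is especially pronounced when log lines are very long.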

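In fact, the Boost library provides the boost::regex_search function. With a suitable flag (such as boost::match_continuous), it matches only a prefix, which is exactly what line-start matching requires. Here is how boost::regex_search is used: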

bool BoostRegexSearch(const char* buffer, size_t size, const boost::regex& reg, string& exception) {
    // ...
    if (boost::regex_search(buffer, buffer + size, what, reg, boost::match_continuous)) {
        return true;
    }
    // ...
}
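To make the semantic difference concrete, here is a minimal sketch that contrasts the two matching modes. It assumes std::regex as a stand-in for Boost.Regex (the standard library offers the analogous std::regex_constants::match_continuous flag), and the wrapper names FullMatch and PrefixMatch are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper: true only if the regex consumes the whole line,
// which is why a line-start pattern needs a trailing ".*" in this mode.
bool FullMatch(const std::string& line, const std::regex& reg) {
    return std::regex_match(line, reg);
}

// Hypothetical wrapper: true if the regex matches a prefix that starts at
// the first character. match_continuous prevents the search from sliding
// forward to later positions in the line.
bool PrefixMatch(const std::string& line, const std::regex& reg) {
    return std::regex_search(line, reg, std::regex_constants::match_continuous);
}
```

With these wrappers, FullMatch rejects the pattern cnt on the line "cnt:1,log:x" because the rest of the line is left unconsumed, while PrefixMatch accepts it and still rejects a line such as "  at cnt" where cnt does not appear at the very start.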

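In Logtail, because the existing line-start implementation requires it, users' line-start regexes all carry a .* suffix. We can automatically strip the trailing .* and prepend ^ to the regex to make matching more efficient.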
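The rewrite described here could look roughly like the following. This is a sketch under assumptions, not Logtail's actual implementation: the function name ToPrefixRegex is hypothetical, and it is deliberately conservative, leaving the suffix in place whenever the dot appears to be escaped. It also does not handle patterns with a top-level alternation such as a|b.*, which would need to be wrapped in a group before ^ is prepended.

```cpp
#include <string>

// Hypothetical helper, not part of Logtail: turns "cnt.*" into "^cnt".
std::string ToPrefixRegex(std::string pattern) {
    const std::string suffix = ".*";
    const bool hasSuffix = pattern.size() >= suffix.size() &&
        pattern.compare(pattern.size() - suffix.size(), suffix.size(), suffix) == 0;
    // In a pattern such as "a\.*" the dot is escaped, so the tail is not a
    // wildcard. Not stripping is always safe; it is merely slower.
    const bool escapedDot = hasSuffix && pattern.size() >= 3 &&
        pattern[pattern.size() - 3] == '\\';
    if (hasSuffix && !escapedDot) {
        pattern.erase(pattern.size() - suffix.size());
    }
    // Anchor the pattern at the start of the line unless it already is.
    if (pattern.empty() || pattern.front() != '^') {
        pattern.insert(pattern.begin(), '^');
    }
    return pattern;
}
```

Doing the rewrite once, when the collection configuration is loaded, would keep existing user configurations working unchanged while the per-line matching cost no longer depends on line length.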

As with boost::regex_match, I also tested boost::regex_search against log length. The time taken by boost::regex_search stays essentially flat and does not grow as the log gets longer.

static void BM_Regex_Search(int batchSize) {
    std::string buffer = "cnt:";
    std::string regStr = "^cnt";
    boost::regex reg(regStr);
    // Truncate the output file left over from any previous run.
    std::ofstream outFile("BM_Regex_Search.txt", std::ios::trunc);
    outFile.close();
    for (int i = 0; i < 1000; i++) {
        std::ofstream outFile("BM_Regex_Search.txt", std::ios::app);
        // Grow the log line by one character per round.
        buffer += "a";
        int count = 0;
        uint64_t durationTime = 0;
        for (int j = 0; j < batchSize; j++) {
            count++;
            uint64_t startTime = GetCurrentTimeInMicroSeconds();
            if (!boost::regex_search(buffer, reg)) {
                std::cout << "error" << std::endl;
            }
            durationTime += GetCurrentTimeInMicroSeconds() - startTime;
        }
        outFile << i << '\t' << "durationTime: " << durationTime << std::endl;
        // Throughput: bytes matched per second.
        outFile << i << '\t' << "process: "
                << formatSize(buffer.size() * (uint64_t)count * 1000000 / durationTime) << std::endl;
        outFile.close();
    }
}

int main(int argc, char** argv) {
    std::cout << "BM_Regex_Search" << std::endl;
    BM_Regex_Search(10000);
    return 0;
}
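For readers who want to reproduce the comparison without the Logtail helpers (GetCurrentTimeInMicroSeconds, formatSize), a dependency-free variant might look like the sketch below. It assumes std::regex and std::chrono in place of Boost and the Logtail utilities, so absolute numbers will differ from the ones measured in this article; all names in it are hypothetical.

```cpp
#include <chrono>
#include <cstdint>
#include <regex>
#include <string>

struct BenchResult {
    int matched;          // how many iterations reported a match
    std::int64_t micros;  // total elapsed wall-clock time in microseconds
};

// Runs `matcher` `iterations` times against `line` and times the whole loop.
template <typename Matcher>
BenchResult TimeMatcher(const std::string& line, int iterations, Matcher matcher) {
    int matched = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        if (matcher(line)) {
            ++matched;
        }
    }
    const auto elapsed = std::chrono::steady_clock::now() - start;
    const auto micros = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    return {matched, static_cast<std::int64_t>(micros)};
}

// Full-line match with the original pattern "cnt.*".
BenchResult BenchFullMatch(const std::string& line, int iterations) {
    const std::regex reg("cnt.*");
    return TimeMatcher(line, iterations, [&reg](const std::string& s) {
        return std::regex_match(s, reg);
    });
}

// Prefix-only match with the rewritten pattern "^cnt".
BenchResult BenchPrefixSearch(const std::string& line, int iterations) {
    const std::regex reg("^cnt");
    return TimeMatcher(line, iterations, [&reg](const std::string& s) {
        return std::regex_search(s, reg, std::regex_constants::match_continuous);
    });
}
```

Timing the whole loop rather than each call keeps the clock overhead out of the measurement. Calling both functions on lines of increasing length, for example "cnt:" followed by 100, 1000, and 5000 filler characters, should show whether the same trend appears with the standard library's regex engine.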

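Performance Testing

After making this change, I ran a comparison test of Logtail's performance before and after the improvement, and the results show a significant gain. The test environment was as follows: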

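The same ACK cluster and machines of the same specification (ecs.c7.4xlarge, 16 vCPU, 32 GiB, compute-optimized c7);
2048 GB ESSD cloud disk, PL3 performance level;
Logtail startup parameters left at their defaults;
The same log-printing program;
The same collection configuration, with only the line-start regex cnt.* configured;
The same logs generated in real time, characterized by a relatively long line that matches the line-start regex: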

cnt:13472391,thread:2,log:Exception in thread "main" java.lang.NullPointerExceptionat  com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitleat com.example.myproject.Book.getTitle

at com.example.myproject.Book.getTitle at com.example.myproject.Book.getTitle at com.example.myproject.Book.getTitle