In the first two articles, we introduced the background of applying NLP technology at Yixin ("Agile AI | NLP Technology Practice in Yixin's Business [Background]") and one application scenario ("Agile AI | NLP Technology Practice in Yixin's Business [Intelligent Chatbot]"). This article covers another scenario: how to build customer portraits for the business. Read on~
About the Author
Jing Yuxin holds a doctorate in EECS. His research interests include computer software and theory and logical reasoning. He currently works at the Yixin Technology R&D Center on artificial intelligence, machine learning, natural language processing, and knowledge engineering.
Advanced Scenario: Constructing Customer Portraits
In many enterprises, day-to-day communication between business staff and customers produces a large volume of records, including customer-service communication data (call recordings, call summaries) as well as various kinds of reports (accompanying-visit reports, credit reports, etc.) (see Figure 1).
Figure 1 Communication Records between Business Personnel and Customers
The former tends to be colloquial, while the latter is mainly written language. They share a common feature, however: both contain rich customer information. To extract it, we need natural language processing (NLP) technology.
Figure 2 is an excerpt from a client's accompanying-visit report. Examining its text, we find plenty of information the business cares about: in terms of occupation, the client is a "university professor"; in terms of investable assets, the amount under management is "1 million" and the investment type is "bank wealth management"; the attitude toward the company is "does not understand it"; and so on.
Figure 2 Example of Customer Accompanying Visit Report
We can therefore analyze the text with NLP, tag and extract customer features, and construct a customer portrait from the resulting labels. The benefits are many: business staff can spot key issues and follow up at any time, and automatic processing improves efficiency. Once the portrait is built from the mined information, demand characteristics within a given time range can be easily aggregated, providing reference data for designing new products or for supplementing and verifying structured fields.
The overall implementation route is shown in Figure 3. First, a tag library of business interest is defined through business analysis; then an extraction model is trained for each defined tag; finally, the models are applied to the data to produce a series of customer tags, which are summarized into the final customer portrait.
Figure 3 Overall Implementation Route
That is the overall route, but the implementation requires attention to some details. Analysis of the historical data revealed several characteristics: information in the text is highly concentrated and usually expressed in short sentences, yet a single short sentence is often semantically ambiguous and must be interpreted in context. We therefore need to split complex sentences appropriately, choose a suitable data granularity, and use a sliding window over short sentences of appropriate size to capture the relevant contextual semantics.
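The splitting and sliding-window idea can be sketched as follows. This is a minimal illustration, not the production code; the punctuation set and window size are assumptions.

```python
import re

def split_short_sentences(text):
    """Split a complex sentence into short clauses on common punctuation."""
    parts = re.split(r"[,;，；。.!?]", text)
    return [p.strip() for p in parts if p.strip()]

def sliding_windows(clauses, window=3):
    """Yield each clause together with its neighbours, so a label model
    sees enough context to resolve an ambiguous short sentence."""
    for i in range(len(clauses)):
        lo = max(0, i - window // 2)
        hi = min(len(clauses), i + window // 2 + 1)
        yield clauses[i], " ".join(clauses[lo:hi])
```

Each clause is then classified together with its surrounding context rather than in isolation.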
In addition, internal business text contains a large number of product-specific entity names and terms, as well as many numbers. We therefore built a dedicated lexicon and entity library to segment and recognize these entity names and terms accurately. For handling numbers, we compared several technical schemes, including word vectors, identifier replacement, and rule-based recognition plus post-processing, and chose the one that performed best.
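As a hedged illustration of the "rule-based recognition plus post-processing" option, the hypothetical normalizer below finds amount expressions with a rule (a regular expression) and post-processes them into canonical numeric values; the unit table and pattern are assumptions, not the actual scheme used.

```python
import re

# Hypothetical unit table mapping suffixes to multipliers.
UNIT = {"k": 1_000, "thousand": 1_000, "m": 1_000_000, "million": 1_000_000}

AMOUNT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(million|thousand|[km])?", re.I)

def normalize_amounts(text):
    """Recognize amount expressions by rule, then normalize each one
    to a plain numeric value in the post-processing step."""
    results = []
    for m in AMOUNT_RE.finditer(text):
        value = float(m.group(1))
        unit = (m.group(2) or "").lower()
        results.append(value * UNIT.get(unit, 1))
    return results
```

Normalized values like these are much easier for a downstream classifier to use than raw digit strings.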
Of course, we also faced the common problem of insufficient annotated corpora, so in this project we focused on how to carry out few-shot learning with small samples.
In fact, most AI projects in specialized domains suffer from insufficient labeled data, so few-shot learning is becoming more and more important. It covers many techniques: the common transfer learning plus fine-tuning approach, exemplified by BERT; semi-supervised techniques, such as similarity-based neural network models and label diffusion based on nearest-neighbor algorithms; meta-learning methods, such as the OpenAI best paper at ICLR 2018; and even techniques based on graph networks.
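The nearest-neighbor label diffusion mentioned above can be sketched in a few lines. This is a toy version under stated assumptions (cosine similarity, a fixed confidence threshold); the real scheme may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def diffuse_labels(labeled, unlabeled, threshold=0.8):
    """Copy the label of the nearest labeled vector onto each unlabeled
    vector when the similarity clears the threshold; otherwise leave the
    sample out.  `labeled` is a list of (vector, label) pairs."""
    pseudo = []
    for vec in unlabeled:
        sim, label = max(((cosine(vec, lv), ll) for lv, ll in labeled),
                         key=lambda t: t[0])
        if sim >= threshold:
            pseudo.append((vec, label))
    return pseudo
```

The pseudo-labeled samples are then merged into the training set, stretching a small annotated corpus further.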
Among these, the approach most amenable to engineering and easiest to implement is transfer learning. In our project, we found that transferring a pre-trained model to the target training task, supplemented by semi-supervised labeling assistance, met our needs well.
Our algorithm flow is as follows:
First, complex sentences are cleaned and split. Then some filtering rules can optionally be applied to quickly remove obvious noise. The data then flows into the label extraction model to obtain specific labels. Finally, in the portrait construction stage, all labels are de-duplicated and disambiguated to form the final customer portrait.
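The steps above can be sketched end to end. This is a minimal skeleton: `extract_labels` stands in for the trained model and `noise_rules` for the optional filters, both hypothetical placeholders.

```python
def build_portrait(records, noise_rules, extract_labels):
    """Toy flow: clean -> rule filtering -> label model -> de-duplication.
    `extract_labels` stands in for the trained extraction model."""
    labels = []
    for text in records:
        text = text.strip()
        if not text or any(rule(text) for rule in noise_rules):
            continue                      # drop obvious noise early
        labels.extend(extract_labels(text))
    # de-duplicate while keeping first-seen order
    seen, portrait = set(), []
    for label in labels:
        if label not in seen:
            seen.add(label)
            portrait.append(label)
    return portrait
```

Each stage is independent, so the filters and the model can be improved or swapped without touching the rest of the pipeline.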
As for the algorithm model, we also compared many methods in turn. In essence, we treat label recognition as a short-text classification task. We tried statistics-based methods (SVM, Random Forest, XGBoost) and neural network models (FastText, TextCNN/RNN/RCNN, HAN), and finally chose the HAN model, i.e. the Hierarchical Attention Network. It applies an RNN plus attention at both the word level and the sentence level to obtain a reasonable text vector representation for the final classification. The whole process is shown in Figure 4.
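The attention step at the heart of HAN can be illustrated without any deep learning framework. This sketch shows only the word-level attention pooling (score each RNN hidden state against a learned context vector, softmax, weighted sum); in the real model the context vector is learned and the same computation is repeated at the sentence level.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_pool(hidden_states, context):
    """Word-level attention as in HAN: dot each hidden state with a
    context vector, softmax the scores, and return the weighted sum
    as the sentence vector."""
    scores = softmax([sum(h_i * c_i for h_i, c_i in zip(h, context))
                      for h in hidden_states])
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(scores, hidden_states))
            for d in range(dim)]
```

Words whose hidden states align with the context vector dominate the sentence representation, which is what lets the model focus on label-bearing phrases.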
Figure 4 HAN Model Architecture
Figure 5 shows the overall processing flow of this example. After data preprocessing, the text is distributed in parallel to the extraction model for each business tag, and each tag is output. The results are then gathered by the customer portrait construction module, where duplicates are removed and ambiguities and contradictions are resolved, yielding the final customer portrait.
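One simple way to implement the de-duplication and contradiction resolution in the construction module is sketched below. The (timestamp, field, value) triple format and the "most recent value wins" policy are hypothetical assumptions for illustration.

```python
def merge_labels(labeled_events):
    """Hypothetical portrait-construction step: labels arrive as
    (timestamp, field, value) triples from the extraction models.
    Duplicates collapse into one entry; when two values contradict
    on the same field, the most recent one wins."""
    portrait = {}
    for ts, field, value in sorted(labeled_events):
        portrait[field] = value          # later timestamps overwrite
    return portrait
```

A real system might instead resolve conflicts by model confidence or source priority; the dictionary-per-field structure stays the same.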
Figure 5 Overall Processing Flow of the Example
In addition, we designed a corresponding real-time AI solution on the company's agile real-time data platform, as shown in Figure 6. It uses several of our team's open-source technologies: DBus (data bus platform), Wormhole (stream processing platform), Moonbox (computing service platform), and Davinci (visualization platform). Together, these four platforms form the agile big data platform stack.
In this scheme, DBus collects natural language data from various data stores, converting it to text where needed with optional technologies such as ASR. Wormhole then performs real-time stream processing: the label model runs on Wormhole's real-time data stream, automatically extracting the corresponding labels from the text, and Wormhole writes them out to a designated data store. Moonbox then performs the subsequent summarization: it reads the previously computed labels from storage, builds the portrait with the portrait model, writes the result to a store such as Redis, and finally pushes it to the business system for use. This is the real-time user portrait processing flow we implemented.
In addition, in the data flow branch at the bottom of Figure 6, we selectively sample the production data flowing through Wormhole and compute customer portraits with the label and portrait models. We then display the raw data, label data, and customer portraits to our model maintainers through Davinci so they can evaluate and check how the models are running, giving us a real-time model-monitoring system. Combining the two yields a real-time portrait construction system based on text analysis.
As enterprises pay more attention to natural language data, NLP and AI have become core foundational technical services across many fields. Combining domain knowledge with NLP has produced new technical products and created new commercial value. Consider products we use today such as Siri and Xiao Ai: this kind of conversational UI not only brings a brand-new mode of interaction but also opens up a new product field.
On the data side, although a large stock of natural language data exists, high-quality processed natural language resources remain scarce in both general and specialized domains, which makes them very valuable. Accumulating domain corpora can greatly improve the effectiveness of AI products and, to some extent, help enterprises build new data and technology barriers.
On the algorithm side, as mentioned above, few-shot learning for small-corpus tasks will receive more and more attention, especially the transfer learning represented by BERT, which is already bringing a revolution to some NLP tasks. There are also data augmentation techniques for NLP corpora: data augmentation is mature and commonplace in the image field, but its development in NLP is not yet mature. A breakthrough there would, we believe, greatly help all kinds of NLP tasks.
Advancing NLP technology also requires the joint efforts of enterprises across the industry and of algorithm and engineering experts. We believe that in the future we will be able to understand natural language data in every field more accurately, quickly, and conveniently.
Author: Jing Yuxin, Yixin Institute of Technology