Fan Yeliang (Leo Van)
- Deep Learning
- Reinforcement Learning
- Causal Inference & Reasoning
- Computer Vision
- Internet of Things (IoT), AIoT
- Intelligent Agriculture, Agritech
- Natural Language Processing
- Complex Network
- Knowledge Graph
- 2012.09 ~ 2015.03, Hebei University of Technology, M.S. in Business Management
- 2008.09 ~ 2012.07, Hebei University of Technology, B.S. in Information Management
- 2015.03 ~ Present, JD Finance, Senior Research Engineer
- 2013.11 ~ 2014.02, Founder International Co., Ltd., Algorithm Engineer
2019.07 ~ Present, Senior Research Engineer
Daat (Complex Network & Knowledge Graph)
2018.04 ~ 2019.06, Project Leader
- Data Knowledge Engineering & Data QA System: Design and development of an ontology for the data warehouse, data market, and data tools. Based on the ontology and the extracted knowledge, we built a knowledge base of the data. We also developed a data QA system using techniques such as intent classification, slot filling, query rewriting, ranking, and DSSM-based question matching. The QA system aims to make the data warehouse and data market easier and more convenient to use. It can also answer questions about data concepts, data processing flows, and data tools.
- Automatic Sensitive Information Identification: Development of automatic sensitive information identification for the data warehouse, which helps to define the data encryption policy. The model is a Wide & Deep network that uses meta-information of the data (e.g., table name, table comment, column name, column comment) and value-information of the data (e.g., the values in each column). The Wide network is built on extracted traditional features and the Deep network on text features using Char Embedding + CNN. The model achieves an F1 score of 95%+ on test data.
- Large Scale Heterogeneous Network Embedding: Development of a large-scale heterogeneous network embedding algorithm (tens of millions of vertices and hundreds of millions of edges). We implemented the algorithm based on meta-paths with rich business meanings, and provide the embedding results as features for other models.
- Recommendation and Marketing based on User Network and User Behavior: Leveraging historical orders, we built a large heterogeneous network containing users, addresses, goods, etc. With the embedding results of this network, we developed a candidate generation algorithm for recommendation, which achieves a 20%+ improvement over traditional methods.
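The meta-path idea behind the network embedding items above can be illustrated with a small sketch. The source does not say which algorithm was used, so this assumes a metapath2vec-style guided random walk. The toy graph, the `type:id` node naming, and the function names are all hypothetical.

```python
import random

# Toy heterogeneous graph: node -> list of neighbours.
# Node type is the prefix before ":" (hypothetical naming, not from the project).
GRAPH = {
    "user:u1": ["address:a1", "goods:g1"],
    "user:u2": ["address:a1", "goods:g2"],
    "address:a1": ["user:u1", "user:u2"],
    "goods:g1": ["user:u1"],
    "goods:g2": ["user:u2"],
}


def node_type(node):
    """Return the type prefix of a node id such as 'user:u1'."""
    return node.split(":", 1)[0]


def metapath_walk(graph, start, metapath, walk_length, rng):
    """Random walk that only follows edges matching a cyclic meta-path.

    `metapath` starts and ends with the same type, e.g.
    ["user", "address", "user"] for "users sharing an address".
    """
    if node_type(start) != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")
    # Drop the repeated last type so the pattern can be cycled.
    cycle = metapath[:-1]
    walk = [start]
    while len(walk) < walk_length:
        next_type = cycle[len(walk) % len(cycle)]
        candidates = [n for n in graph[walk[-1]] if node_type(n) == next_type]
        if not candidates:  # dead end: no neighbour of the required type
            break
        walk.append(rng.choice(candidates))
    return walk


rng = random.Random(0)
walk = metapath_walk(GRAPH, "user:u1", ["user", "address", "user"], 5, rng)
print(walk)
```

In a pipeline of this kind, many such walks would typically be fed to a skip-gram model as "sentences", and the learned node vectors used as features or for nearest-neighbour candidate generation.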
All Seeing Eyes (Chinese Address Analytics)
2015.03 ~ 2018.04, Project Leader
- Development of Chinese address analytics algorithms, including segmentation, classification, integrity, POI identification, and similarity (accuracy 90%+).
- Development of Address Profile System based on the basic algorithm engine. It increased the conversion rate of users by 30%+ in the offline payment service.
- Development of the anti-fraud and credit models based on the Chinese address analysis system. The anti-fraud model identified illegal encashment orders worth 200,000 CNY/day, and more than 10 million users were granted credit with the credit model.
- This project was awarded the “Innovation Seed” prize in the JingYa Cup Innovation Competition at JD.com, ranking 20th of 378.
- Development of Enterprise Address Profile System based on the basic algorithm engine which was part of JD Enterprise Credit.
- Development of the Rural Finance Service Station Location Solution based on the Address Profile System.
User Behavior Analytics
2017.10 ~ 2017.12, Algorithms Engineer
- Development of a user behavior representation method named Behavior2Vec. Based on hierarchical clustering and depth search, we proposed a hybrid model for identifying abnormal user behavior. Compared with Bag of Words and N-gram methods, it identifies 3+ times as many abnormal users.
2015.03 ~ 2015.10, Algorithms Engineer
- Development of a hybrid product life cycle identification model based on the Bass Diffusion model, an optimized time series similarity method, and a clustering method. It achieved 95%+ accuracy in identifying excess-inventory products, which helped to make pledged-goods loan decisions and calculate the loan-to-value ratio.
- Development of a product information fusion model and system with Elasticsearch, which achieved 90%+ recognition accuracy and provided accurate and relevant information such as price.
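For readers unfamiliar with the Bass Diffusion model named in the product life cycle item above, the following is a minimal sketch of its standard closed form. It does not reproduce the hybrid model described there. The parameter values (`p`, `q`, `m`) are illustrative assumptions, not figures from the project.

```python
import math


def bass_cumulative(t, p, q):
    """Cumulative adoption fraction F(t) of the Bass model.

    p: coefficient of innovation, q: coefficient of imitation.
    """
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)


def bass_peak_time(p, q):
    """Time at which the adoption rate peaks (requires q > p)."""
    return math.log(q / p) / (p + q)


def period_sales(m, p, q, periods):
    """Expected sales in each unit period for a market of size m."""
    return [
        m * (bass_cumulative(t, p, q) - bass_cumulative(t - 1, p, q))
        for t in range(1, periods + 1)
    ]


# Illustrative (assumed) parameters.
p, q, m = 0.03, 0.38, 10_000
sales = period_sales(m, p, q, 20)
peak_period = sales.index(max(sales)) + 1
print(round(bass_peak_time(p, q), 2), peak_period)
```

A product whose observed sales are already well past the fitted peak and still declining is a plausible excess-inventory candidate, which is one way such a curve could feed a life cycle stage classifier.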
Smart Public Security (EzMap)
2013.11 ~ 2014.02, Algorithms Engineer
- Development of a time series prediction algorithm based on EMD and SVR for crime analysis in a smart city project.
- Development of a serial crime case identification algorithm based on k-Prototypes.
- Development of public security metadata import tools and XML-based police GIS database upgrade tools.
- R: ★★★★☆
- Python: ★★★★☆
- SQL: ★★★★☆
- HTML / CSS / JS: ★★☆☆☆
- Lisp: ★★☆☆☆
- TensorFlow: ★★★☆☆
- PyTorch: ★★☆☆☆
- Qt: ★★☆☆☆
- Axure: ★★★★☆
- Sketch: ★★★★☆
- OmniGraffle: ★★★★☆
- Zhou, F., Yin, H., Zhan, L., Li, H., Fan, Y., & Jiang, L. A Novel Ensemble Strategy Combining Gradient Boosted Decision Trees and Factorization Machine Based Neural Network for Clicks Prediction. In 2018 International Conference on Big Data and Artificial Intelligence (BDAI) (pp. 29-33). IEEE.
- Li, J., Fan, Y.*, Xu, Y., & Feng, H. (2013, December). An Improved Forecasting Algorithm for Spare Parts of Short Life Cycle Products Based on EMD-SVM. In Information Science and Cloud Computing Companion (ISCC-C), 2013 International Conference on (pp. 722-727). IEEE.
- Research on Dynamic Pricing Strategies of Digital Products based on Network Externality. Master's Thesis, 2014.
- A kind of Chinese address segmenting method and system (CN 105159949, Issued, 2015)
- Product inventory predicting method and product inventory predicting device (CN 106056239, Under Examination, 2016)
- Product life cycle identification method and device (CN 106408217, Under Examination, 2017)
- Address similarity calculation method and apparatus (CN 107239442, Under Examination, 2017)
- Data warehouse information processing method, device, system, medium (CN 109388637, Under Examination, 2018)
- The method and apparatus for determining the type of literary name section (CN 109784407, Under Examination, 2019)
- A kind of data processing method, device, equipment and medium (CN 110309235, Under Examination, 2019)
Open Source Projects
- Data Science Introduction With R: a getting-started data science tutorial based on R (in Chinese).
- Sci-Hub EVA: a cross-platform Sci-Hub GUI application.