JUNRU LU
Natural Language Processing and Machine Learning enthusiast
[email protected] | Coventry, UK | github.com/LuJunru | lujunru.github.io
ACHIEVEMENTS
National 6th position - SMP
• Sixth place in the final round of the 6th national Social Media Processing (SMP) Contest. The contest comprised three tasks: keyword extraction from blogs, user interest labeling, and user growth prediction.
• For the first task, we combined Tf-Idf, TextRank, LDA, and manual rules to find keywords. The second task was handled by stacked classification over document embeddings (a later improvement of 3% came from using TextCNN and self-attention). The final task was done with a stacked regression model [Codes].
University Honor
• University Scholarships of the University of International Relations, 2015, 2016, 2017 (Top 5%).
• 2018 excellent graduation thesis and outstanding graduate of the University of International Relations.
Enactus World Cup
Third prize in the national final and first prize in the regional semi-final of the 2018 Enactus World Cup.
Like a Boss Product Design Contest
Global champion, awarded $30,000. This contest is hosted by the Boss Zhipin App and invites students to improve its product design. [Official Website]
STRENGTHS & SKILLS
Natural Language Processing, Machine Learning, Web Crawler
PyTorch, PySpark, SkLearn
CODING
Python, SQL, JavaScript
EDUCATION
MPhil/Ph.D. candidate in Computer Science
University of Warwick
Nov. 2019 – Nov. 2023 | Coventry, UK
• Supervisor: Prof. Yulan He
• Department: Computer Science
• Research Interests: Text representation, question answering, and controllable text generation. Currently, my work focuses on question answering in real-world scenarios.
M.S. in Applied Urban Science & Informatics
New York University
Sept. 2018 – Sept. 2019 | New York City, USA
• GPA: 3.67/4.00
• Center: Center for Urban Science + Progress
• Department: Tandon School of Engineering
• Project 1: Collected geo-tagged tweets in NYC using the Twitter API and created spatio-temporal emotion maps through sentiment analysis [Codes].
• Project 2: Used RandomForest and LGBM models on Yelp reviews and related user and business history to predict the reviews' ratings [Codes].
• Capstone project: Used difference-in-differences (DID) and a Bayesian network to infer the causal effect of the growth of Uber & Lyft on NYC parking violations [Codes].
B.Eng. in Information Management and Information System
University of International Relations
Sept. 2014 – July 2018 | Beijing, China
• GPA: 90.6/100
• Supervisor: Prof. Binyang Li
• Department: School of Information Science and Technology
• Final Thesis: A two-stage multi-attention Machine Reading Comprehension model (A-Reader). In A-Reader, text representation is realized with self-attention, while semantic interaction between article and question is based on self-attention and bi-attention. "Two-stage" refers to first using the final semantic matrix (FSM) within a binary classification model to select the best paragraph, and then predicting the answer via a pointer network with the FSM.
Double Bachelor of Economics
Peking University
Sept. 2015 – July 2018 | Beijing, China
• GPA: 84.2/100
• Department: National School of Development
• Major courses: Accounting, Econometrics, Microeconomics, and Finance.
• Notes: The Double B.Eco. is a program, running since 2003, for non-economics undergraduates from PKU and other universities who are interested in economics.
https://github.com/LuJunru
https://lujunru.github.io/
https://github.com/LuJunru/SMPCUP2017_ELP
https://www.zhipin.com/zt/2019/likeaboss/pc/
https://sites.google.com/view/yulanhe/home?authuser=0
https://github.com/LuJunru/ADS2018_Final_Project
https://github.com/LuJunru/MLC2019_Final_Project
https://github.com/LuJunru/NYU-CUSP-Capstone-2019
https://xinke.uir.cn/c/2016-02-26/541594.shtml
http://en.nsd.pku.edu.cn/programs/doublemajor/index.htm
PUBLICATIONS
Journal Articles
• Lu, Junru et al. (2019). "Identifying User Profile by Incorporating Self-Attention Mechanism based on CSDN Data Set". In: Data Intelligence 1.2, pp. 160–175.
Conference Proceedings
• [Accepted by COLING 2020] Lu, Junru et al. (2020). "Chime: Cross-passage Hierarchical Memory Network for Generative Review Question Answering". In: The 28th International Conference on Computational Linguistics.
• Cai, Junjie et al. (2020). "Estimating the effect of Uber & Lyft on parking violation in NYC". In: Transportation Research Board Annual Meeting.
last updated: July 2nd, 2019
WORK EXPERIENCE
Natural Language Processing Engineer Internship
Beijing iDeepWise Artificial Intelligence Tech Ltd
Mar. 2018 – June 2018 | Beijing, China
• Project 1: Researched two sentence similarity models, TextCNN and Siamese-LSTM. TextCNN constructs a 2D input by calculating the W2V cosine similarity of the sentences' sub-units, while Siamese-LSTM uses Manhattan distance to compare the final hidden states of the sentences encoded with an LSTM [Codes].
• Project 2: Implemented a two-stage BiDAF model on the DuReader dataset. BiDAF is a traditional Machine Reading Comprehension model that uses a bi-LSTM for text representation and bi-attention for semantic interaction. "Two-stage" refers to first selecting the best paragraph with manually selected features, and then predicting the answer via a Pointer Network.
Data Mining Engineer Internship, Dept. of Text Mining
Beijing Baifendian InfoTech Ltd
Oct. 2017 – Mar. 2018 | Beijing, China
• Project: Developed a single-round, community-based Chinese Deep QA system. The system consists of a (Q, A) knowledge database, a first-round query engine based on Elasticsearch, second-round selection modules that run a semantic similarity check on (New Q, Old Q) and a pair quality check on (New Q, Old A), and a compensatory web crawler for unknown new questions [Codes].
https://github.com/LuJunru/Sentences_Pair_Similarity_Calculation_Siamese_LSTM
https://github.com/LuJunru/My_QA_Robot