Management Review, 2023, Vol. 35, Issue (9): 142-154.

• E-Commerce and Information Management •

Measuring Task Complexity in Tournament-based Crowdsourcing: A Topic Modeling Approach

Liu Zhongzhi1,2, Zhao Ming1

  1. Dongwu Business School, Soochow University, Suzhou 215021;
  2. Research Center for Smarter Supply Chain, Soochow University, Suzhou 215021
  • Received: 2021-10-06  Online: 2023-09-28  Published: 2023-10-31
  • Corresponding author: Zhao Ming, Master's student, Dongwu Business School, Soochow University.
  • Author bio: Liu Zhongzhi, Ph.D., Associate Professor and Master's supervisor, Dongwu Business School, Soochow University.
  • Funding:
    General Program of the National Natural Science Foundation of China (No. 71874118).

Abstract: In tournament-based crowdsourcing, task complexity is one of the key factors affecting both the crowd size of a contest and the quality of the solutions. Existing research mainly infers task complexity indirectly from participants' behavior; such measures incorporate considerable subjective information, which weakens their validity and leads to measurement error and inconsistent conclusions. In the absence of an objective and effective measure, it is difficult for firms and participants to match task complexity with other contest parameters and with personal characteristics, so measuring task complexity has become a challenging problem in empirical crowdsourcing research. Using 3,205 tournament-based programming tasks from the Topcoder platform, this study applies latent Dirichlet allocation (LDA) to the task descriptions, extracts 38 topic modules, and constructs objective task complexity measures along three dimensions: module complexity, coordination complexity, and dynamic complexity. Exploratory factor analysis and negative binomial regression are then used to refine and validate the measures. The topic-model-based measures show acceptable discriminant validity against existing measures and reliable convergent validity among themselves. Regression results show that four objective indicators (the number of technology modules, description readability, module concentration, and dynamic complexity) negatively affect crowd size (i.e., the numbers of registrants and submitters), which is consistent with theoretical predictions and indicates good predictive validity. The four indicators also agree closely with assessments by domain experts. This paper provides an automated approach to the practically important and challenging problem of measuring objective task complexity. It broadens crowdsourcing research on task design, participation behavior, and performance, and it offers platform managers a way to analyze task complexity from multiple perspectives and to match it with other task design parameters.
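As a rough illustration of the measurement pipeline summarized above, the sketch below fits an LDA model with 38 topics to a set of task descriptions and derives two toy proxies: a count of salient topic modules per task and a Herfindahl-style topic-concentration score. It uses scikit-learn rather than whatever toolchain the authors used, and the toy corpus, the 0.05 salience threshold, and all variable names are illustrative assumptions, not the paper's definitions.

```python
# Illustrative sketch only: fit LDA with 38 topics on task descriptions and
# derive simple proxies for "number of modules" and "module concentration".
# The corpus, threshold, and variable names are assumptions, not the paper's.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# In practice this would be the 3,205 Topcoder task descriptions.
task_descriptions = [
    "Implement a REST API module for contest registration and scoring.",
    "Refactor the database schema and write migration scripts for the platform.",
    "Build a responsive front-end dashboard that visualizes submission statistics.",
    "Optimize the matching algorithm and add unit tests for edge cases.",
]

# Bag-of-words representation of the descriptions.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(task_descriptions)

# LDA with 38 topics, the number reported in the paper.
lda = LatentDirichletAllocation(n_components=38, random_state=42)
doc_topics = lda.fit_transform(dtm)  # each row: per-task topic weights, sums to 1

# Toy proxy for the number of technology modules: topics with non-trivial weight.
salient_modules = (doc_topics > 0.05).sum(axis=1)

# Toy proxy for module concentration: Herfindahl index of the topic weights
# (closer to 1 means the description is dominated by a few topics).
module_concentration = (doc_topics ** 2).sum(axis=1)

print(salient_modules, module_concentration)
```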

Key words: topic model, tournament-based crowdsourcing, task complexity, text mining, machine learning
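
The abstract also mentions negative binomial regressions of crowd size on the complexity indicators. A minimal sketch of that kind of count-data model, using statsmodels and a small made-up DataFrame with hypothetical column names, is shown below; it is not the paper's specification, and an actual study would estimate the dispersion parameter rather than fixing it.

```python
# Minimal negative binomial sketch for crowd size; the data and column names
# are hypothetical, and the dispersion alpha is fixed only for simplicity.
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "registrants":          [34, 12, 8, 51, 22, 17, 40, 9],          # count outcome
    "tech_modules":         [3, 6, 7, 2, 4, 5, 2, 8],                # number of technology modules
    "readability":          [41, 55, 60, 35, 48, 52, 38, 63],        # readability score of the description
    "module_concentration": [0.45, 0.28, 0.22, 0.51, 0.33, 0.30, 0.49, 0.20],
    "dynamic_complexity":   [0.1, 0.4, 0.5, 0.05, 0.2, 0.3, 0.1, 0.6],
})

X = sm.add_constant(df[["tech_modules", "readability",
                        "module_concentration", "dynamic_complexity"]])
y = df["registrants"]

# Negative binomial GLM for an over-dispersed count outcome.
model = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=1.0))
result = model.fit()
print(result.summary())
```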