지역산업연구 | Vol. 42, No. 4 | pp. 235~256
The Effect of the Introduction of Text Mining Analysis on the Accounting and Finance Research and Proposals for Graduate School Education
Lee, Kun Chang* · Na, Hyung Jong**
ABSTRACT
This paper addresses the impact of text mining techniques on research in accounting and finance, and proposes appropriate teaching methods accordingly. Traditional teaching and research in accounting and finance have been limited to analyzing quantified data samples and have therefore focused on empirical analyses, ignoring AI methods such as text mining and sentiment analysis. AI methods must be taken seriously as new topics of education and research because the texts in audit reports, financial statements, and financial news are flooding decision makers in the accounting and financial fields. Therefore, this paper strongly proposes that text mining techniques, one of the AI methods actively used in the MIS field, should be taught and adopted in these fields. In addition, text mining techniques need to be incorporated into the future curriculum of graduate courses in accounting and finance. The contributions of this paper are as follows. First, it emphasizes the need to introduce text mining techniques in order to analyze the huge amount of unstructured text available in accounting and financial documents. Second, it describes in detail the fundamental concepts and procedures of text mining techniques.
Third, it argues that the outdated curriculum of current graduate courses in accounting and finance should be updated to reflect the latest trends of the AI-driven revolution in the modern business world.
|Keywords| text mining, accounting, finance, graduate education
Ⅰ. Introduction
This study examines the impact of text mining technology, an AI (Artificial Intelligence) method for
analyzing unstructured text big data, on research in accounting and finance, and proposes an
appropriate graduate school curriculum.
* (First Author) Professor, SKKU Business School, Sungkyunkwan University, Seoul 03063, Republic of Korea, kunchanglee@gmail.com
** (Corresponding author) Research Professor, SKKU Business School, Sungkyunkwan University, Seoul 03063,
Republic of Korea, fresh_na_77@hanmail.net
236 지역산업연구|제42권 제4호|2019.11
Research in accounting and finance is conducted primarily through empirical analysis. Most samples in
these empirical analyses use only quantified data, and the results are derived primarily from regression
analyses. Of course, some studies propose improvements to taxation in terms of tax law or present new
policies or regulations in the field of audit without empirical analysis, but this paper discusses the need
for text mining techniques and a graduate school curriculum for studies that examine the accounting
and finance sectors through empirical analysis.
Recently, big data technology has made it possible to collect information stored in various sources of
semi-structured and unstructured data (Pezić et al., 2019). A limitation of the research methodology in
accounting and finance is that only quantified data have been used for empirical analyses. This
limitation also narrows the scope of accounting and financial research. More diverse and novel research
requires convergence, that is, introducing research methodologies from other academic fields. The
field of Management Information Systems (MIS) has long worked on text mining techniques. In the
accounting and finance sectors, however, research that introduces and applies these artificial
intelligence technologies is still insufficient.
Therefore, accounting and financial research needs to address more diverse topics by introducing text
mining techniques to analyze unstructured text data. The following looks at how text mining
techniques can be applied in accounting and financial research, and how they can be used to advance it.
Among artificial intelligence technologies, text mining makes non-numeric data such as text available
for empirical research. In other words, unstructured data such as text can be quantified and converted
into variables through text mining techniques.
A great deal of information about firms cannot be expressed in quantitative terms. A firm's financial
statements present quantitative information about its financial condition and performance, and
accounting and finance studies use only the quantitative data in financial statements in their empirical
analyses. Materials such as business reports and audit reports express corporate information as
qualitative data such as text. These qualitative data also contain a lot of important information about
companies, but so far they have rarely been used in empirical analyses.
If text mining techniques are introduced, texts in business reports or audit reports can be quantified
and used for empirical analysis. In other words, the introduction of a new methodology expands the
research data available. This will help open up areas that have not been studied so far.
However, there are real barriers to using text mining techniques in accounting and financial research.
Currently, researchers in accounting and finance are not trained in these artificial intelligence
technologies, so it is practically impossible for them to use text mining techniques in research. The
fundamental solution to this problem lies in reforming the curriculum. In other words, text mining
should be taught in graduate master's and Ph.D. courses by adding a text mining component to the
accounting and finance curriculum.
Professors currently teaching accounting and finance did not themselves receive training in artificial
intelligence technologies such as text mining, so they cannot teach these technologies to potential
researchers such as master's and Ph.D. students. To resolve this situation, outside text mining experts
need to be recruited to provide continuous training in master's and doctoral courses. Then, as time
goes by, the percentage of accounting and finance professors who have acquired text mining skills will
increase.
This study proposes a graduate school curriculum for text mining techniques that can support
accounting and financial research. By accepting innovation without fear of change, accounting and
financial research can move a step forward. This essentially requires educating future accounting and
finance researchers in text mining, one of the artificial intelligence technologies.
The contributions of this paper are as follows. First, we point out the changes in research trends and
explain the need to introduce text mining techniques for analyzing unstructured text data in accounting
and finance research as a way to respond to them. Second, we introduce in detail the concepts and
application methods of text mining techniques that are substantially helpful for accounting and finance
research. Third, as a fundamental solution for bringing text mining technology into accounting and
finance research, we present specific proposals for the education curriculum.
This study is organized as follows. Chapter 2 points out the limitations of research in the accounting
and finance sectors and describes the changing research environment and trends. Chapter 3 introduces
prior literature that uses text mining techniques in accounting and financial research. Chapter 4
explains the concepts of text mining techniques and describes specific procedures for applying them to
research in the accounting and finance sectors. Chapter 5 presents education measures for artificial
intelligence technologies. Finally, Chapter 6 concludes.
Ⅱ. Limitations of research methods and changing research flows in accounting and finance
Research in accounting can be broadly divided into financial accounting, management accounting, tax
accounting, and auditing. In the field of tax accounting, it is possible to point out the problems of the
current taxation system in terms of tax law and suggest improvement measures without conducting an
empirical analysis. It is also possible to present new policies or regulations in the field of audit by
pointing out problems in the existing audit system. However, most accounting studies test hypotheses
through empirical analyses.
Research in finance studies stock prices, firms' capital structure, derivatives and other financial
instruments, capital increases, and so on. Like accounting studies, these studies use data about
companies and mostly test hypotheses through empirical analyses. This paper discusses the limitations
of these empirical studies and presents countermeasures.
The limitations of empirical studies in the accounting and financial sectors are as follows. The scope of
study is limited because the samples used in empirical analyses consist only of quantitative data. The
data used in accounting and finance are primarily firm-year financial data obtained from KIS-Value
(https://www.kisvalue.com), TS-2000 (http://www.kocoinfo.com), and Fn-Guide
(http://www.fnguide.com).
Firms' financial data are downloaded from these sources, and other data are hand-collected. What the
downloaded data and the hand-collected data have in common is that only quantified data are used for
statistical analysis, which in turn limits the scope of the study.
When a company discloses information externally, it does so not only quantitatively but also
qualitatively. A firm discloses its financial information quantitatively through its financial statements
and qualitatively through its business reports, audit reports, management diagnosis statements, and so
on. Up until now, accounting and finance studies have not been able to use information that a company
expresses qualitatively (e.g., text data in a business report or an audit report) in empirical analysis. This
limits the research topics that can be studied in the accounting and finance sectors. As a way to
overcome these limitations, this study proposes the introduction of text mining technology, one of the
artificial intelligence technologies.
Artificial intelligence technology has long been studied, and there have been attempts to apply it to
accounting and financial research. Recently, such attempts have been booming overseas. <Table 1>
below compiles the number of Scopus papers in the accounting and finance sectors that analyze data
using artificial intelligence technology (Cockcroft et al., 2018).
<Table 1> The number of Scopus papers in accounting and finance using big data analysis techniques among artificial intelligence technologies
Year | Number of Scopus papers
2007 | 8
2008 | 21
2009 | 33
2010 | 31
2011 | 89
2012 | 660
2013 | 2,429
2014 | 4,763
2015 | 8,412
2016 | 10,715
Sources: Cockcroft et al. (2018)
<Table 1> above includes research in the financial sector as well as in the accounting sector, and the
counts are based on Scopus papers. The number of studies that analyze big data has increased since
2012, and research on big data analysis using artificial intelligence technology has increased rapidly
since 2013.
Indeed, in overseas accounting and finance studies, the number of papers applying these technologies
is on the rise. Such research is also being published in Korea, but not yet as actively as abroad. Below we
introduce studies that use text mining technology in accounting and finance.
Ⅲ. Prior literature in the accounting and finance fields using text mining technology
We introduce accounting and finance studies that quantify unstructured text data with text mining
techniques for use in empirical analysis. Overseas, many accounting and finance papers have analyzed
textual data, a form of unstructured big data, either conducting empirical analyses directly or
discussing the importance of such analysis and explaining its procedures (Gupta and Lehal, 2009;
Zhang et al., 2011; Al-Maimani et al., 2014; Noh and Lee, 2015; Ravi and Ravi, 2015; Wu and Ester,
2015; Költringer and Dickinger, 2015; Anand and Naorem, 2016; Campos et al., 2018).
The major overseas accounting and finance papers relevant to text mining are as follows. Fisher et al.
(2016) note that textual documents are often used to convey information such as management's
assessment of a firm's financial performance and its current and future performance. Therefore, in
order to analyze these texts, they emphasize the importance of collecting and analyzing information
through Natural Language Processing (NLP) and mention the need for research in this field. Gaultz
and Mayo (2017) predict that big data analysis will further improve the audit environment. Big data
analysis helps identify associations among various pieces of information, analyze patterns in
information, and process large volumes of information. Rezaee and Wang (2019) describe the
importance of a system that uses text mining techniques to predict risks in advance. They argue that a
more effective management system can be established through risk assessment that analyzes documents
such as audit reports and corporate analysis reports as well as financial statements. Janvrin and Watson
(2017) explain that big data analysis makes the role of providing enterprise information to internal and
external decision makers more effective and efficient. They also introduce software and analysis
methods for big data analytics, along with a variety of examples. Byrnes et al. (2018) predict that,
because text mining makes unstructured big data analysis possible, the weight of human work in the
audit process will gradually decrease and the role of artificial intelligence systems will increase. Boskou
et al. (2018) analyze the text of financial reports through text mining. They extract key words from the
documents and analyze the associations between similar words to develop new indicators for internal
control functions. Wu et al. (2019) use text mining techniques to analyze the usefulness of news in
predicting stock returns in the Taiwan stock market. News variables prove to provide useful
information for predicting Taiwan stock market returns. They also find an asymmetric effect:
predictive accuracy is higher when the stock market is in recession than when it is booming. Yang et al.
(2018) analyze firms' text reports using text mining techniques and extract the firms' risk factors.
Cheng et al. (2019) study factors that improve the accuracy of stock valuations using text mining
methods. Based on the correlations among key words found in the text mining analysis, they establish a
model of the relationships among factors.
The domestic accounting and finance papers that use text mining are as follows. Lee and Kwon (2015)
used text mining techniques to extract news articles about the social responsibility of medium-sized
enterprises and analyzed their association with corporate performance. Choi et al. (2015) used text
mining techniques to analyze news about companies that had gone bankrupt, and thereby studied the
possibility of predicting corporate bankruptcy. The study found that words such as 'recovery,
disclosure, funding, workout, embezzlement, capital increase, and creditors' appeared frequently in the
news about two months before bankruptcy. Yuk (2018) analyzed CEO messages in sustainability
reports using text mining techniques to study the relationship between those messages and corporate
performance. The study reported that companies with lower corporate performance tend to use more
positive expressions in sustainability reports, write less readable text, and emphasize future
performance. Yoo et al. (2018) selected bus-related complaints from 10,421 electronic civil complaint
records filed during the 2015-2017 period and extracted the key words through text mining analysis.
The degree of connection and the network among the major words were then analyzed through
association analysis. The study provides data for countermeasures to reduce the number of bus
complaints. Kim and Kim (2017) analyzed text posted on a stock market analysis site to derive
information on the sentiment of stock market investors. Jang et al. (2016) used text mining techniques
to investigate how the titles of analysts' investment reports relate to the accuracy and attainment of
analysts' forecasts. The results show that the more clearly a buy or sell opinion is reflected in the title,
the more accurate and attainable the analyst's forecast is. Na et al. (2018) examined postings on the
homepages of U.S. S&P 500 companies through text mining. The analysis found that certain key words
have a systematic link to a firm's present and future financial performance. Kim et al. (2012) analyzed
the contents of news through text mining and examined their relationship with stock prices. The
analysis found a significant relationship between positive and negative news content and the rise and
fall of the stock index; in particular, the accuracy rate was 70.0% when the stock price fell and 78.8%
when it rose. Na et al. (2019) noted
that companies that received an unqualified opinion in the audit report do not all have the same audit
risk, and demonstrated it through an empirical analysis. The content of the audit report was analyzed
through text mining to calculate a quantified sentiment value according to the degree of positivity or
negativity, and the relationship between the sentiment value and audit fees and audit hours was
analyzed. The results show that the sentiment value of the audit report has a significant negative
relationship with audit fees and audit hours. This means that not all unqualified opinions in audit
reports represent the same level of audit risk. Mo and Seo (2019) examined and quantified the notes to
firms' business reports using a text mining technique to study the volatility of the notes and the
response of the stock market. According to the results, the contents of the notes in a firm's business
report have a positive effect by increasing the usefulness of information transfer, reducing the level of
information asymmetry as reflected in the cost of equity capital, stock trading volume, and earnings
response coefficients.
Ⅳ. Text mining technology for unstructured text big data analytics
(1) The concept of big data
Rather than understanding big data simply as the compound word 'big + data', Gandomi and Haider
(2015) explain that it also covers data forms that have so far been difficult to use in the business
environment, such as semi-structured and unstructured data (e.g., photographs, images, and text). In
other words, the concept of big data means not only that the amount of information is far larger than
before, but also that the forms of information vary and that types of information that were previously
unavailable have become available.
Lee (2017) describes the characteristics of big data in terms of size, variety, and velocity. First, size
means the physical size of the data; big data refers to data that has grown to the petabyte (PB)1) scale.
Second, variety means the form of the data, which differs depending on whether the data are stored in a
relational database (RDB)2) used in a traditional business data environment, are web log data, or are
unstructured data such as video, images, or text. Third, velocity means the speed of data processing:
acquiring, processing, and analyzing data should be handled in real time.
In other words, big data does not simply mean large-capacity data itself; rather, the term places more
importance on the technologies that can effectively process and analyze diverse data. Big data analytics
technologies are designed to extract economically valuable information from large volumes of data in
various forms, such as M2M (Machine to Machine) sensor data, social data, and corporate customer
relationship data. It is important to effectively analyze not only structured data but also unstructured
big data such as web documents on the Internet, text data on social media, e-mail, and videos on
YouTube. The volume of big data was estimated at around 2.8 zettabytes in 2012 and was projected to
grow rapidly to around 40 zettabytes3) in 2020, with 20% of that expected to be structured data and the
remaining 80% unstructured data (Yoo, 2013).
<Table 2> below compares past and present data concepts. In the past, data followed a particular
format, but now there is no specific format and the forms vary. In terms of velocity, data used to be
processed in batches collected over a period of time, but now continuously generated data are processed
in real time. In terms of processing cost, data processing used to be relatively expensive but is now
relatively inexpensive. The purpose of processing data used to be analyzing the results of past data, but
the focus is now on system optimization or the development of predictive indicators through data
analysis.
1) 1 PB is 1,024 TB (terabytes); it is a unit for the capacity or throughput of digital data.
2) A database that manages data using two-dimensional tables, that is, tables made up of rows and
columns. The advantage of an RDB is that data can be managed in a form that is easy for people to
understand. In addition, data can be handled without a separate programming language. In other
words, the ability of non-experts to manipulate data has led to an increase in database users.
3) A combination of the prefix "zetta," meaning 10 to the 21st power, and "byte," a unit representing
the amount of computer data. The number equals 10 hae (垓), that is, 1 followed by 21 zeros.
<Table 2> Changes of data concepts
Category | Data concepts in the past | Data concepts at present
Data form | Fixed to a specific format | No specific format; varied forms
Data velocity | Batch level: data collected at regular intervals | Real-time level: continuous data handled right away
Data processing cost | Relatively high cost | Relatively low cost
Data processing purpose | Result analysis | System optimization and future forecasts
Sources: Sivarajah et al. (2017)
At this point in time, text mining is considered the artificial intelligence technology that can be used
most effectively in accounting and finance research. Below we introduce the concepts and specific
procedures of text mining techniques.
(2) Procedure and application of text mining techniques
Text mining is a technique for structuring semi-structured or unstructured text materials based on
natural language processing (NLP)4). It is also an analytical method for finding useful information in
features extracted from text. Text mining techniques analyze text data among unstructured big data to
extract keywords that are repeated significantly often. They also identify associations between key
words and group words with common characteristics (Xiao et al., 2018). Text mining can also list the
words in a document in order of importance, or visualize them so that more important words appear
larger and darker. <Figure 1> below is an example of a word cloud. The word cloud, presented by Na
et al. (2019), shows the most important words in the 2017 audit reports of KOSPI companies in larger
and bolder type and, in contrast, less important words in smaller and lighter type.
4) It is a technology that recognizes and processes the language that we use. That is, it allows humans
and computers to recognize each other's language.
Sources: Na et al. (2019)
<Figure 1> 2017 KOSPI companies audit report word cloud
The procedure for text mining is described in detail below. First, extract the necessary text from the
document using the Bag-of-Words (BoW)5) method. In this process, grammatical particles and other
unnecessary vocabulary are eliminated.
Second, create a TF-IDF (Term Frequency-Inverse Document Frequency) matrix that shows how
significantly the words extracted using the Bag-of-Words method are repeated in a document. The
TF-IDF matrix relates documents to terms and is used primarily as a vectorizing6) method in text
mining. TF-IDF is a relative value: it does not simply count how many times a word is repeated within
a document but indicates how significantly the word is repeated in that document. It is a kind of
weight used in text mining, a statistical value indicating how important a word is in a document
(Ramos, 2003; Paltoglou and Thelwall, 2010).
Third, having followed these basic steps of text mining, we can extract the significantly repeated
keywords from the document. Keywords allow us to grasp the key content and meaning of the
document. <Figure 2> below schematizes the text mining procedure.
5) BoW treats the words used in a document as a set. It is a commonly used vector representation that
counts the occurrences of each word regardless of the order of the words in the document. BoW helps
to characterize a document by extracting important words from unstructured text sources (Eo and Lee,
2019).
6) With text mining techniques, the words in a document are represented by vectors: a word that
appears in the document has a vector value. The vector values are the weights of the words, and a
typical method of obtaining them is the TF-IDF method (Kåebek et al., 2014).
<Figure 2> Text mining procedure
Sources: Gaikwad et al. (2014).
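The three steps above (Bag-of-Words extraction, TF-IDF weighting, keyword selection) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure used in the cited studies: the stop-word list, the sample sentences, and the function names are hypothetical, and the weight is the plain textbook form tf × log(N / df) without smoothing.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real study would use a domain-specific one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "was", "for", "on"}


def bag_of_words(text):
    """Step 1: tokenize, drop stop words, and count terms (word order is ignored)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def tf_idf(documents):
    """Step 2: build one {term: weight} dict per document.

    tf(t, d) = count of t in d / total number of terms in d
    idf(t)   = log(N / number of documents containing t)
    """
    bags = [bag_of_words(doc) for doc in documents]
    n_docs = len(bags)
    # Iterating a Counter yields each distinct term once, so this counts documents.
    doc_freq = Counter(term for bag in bags for term in bag)
    matrix = []
    for bag in bags:
        total = sum(bag.values())
        matrix.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bag.items()
        })
    return matrix


def top_keywords(weights, k=3):
    """Step 3: return the k terms with the highest weight (ties broken alphabetically)."""
    return sorted(weights, key=lambda t: (-weights[t], t))[:k]


# Toy sentences invented for illustration only.
docs = [
    "The audit found a deficit and a delay in the loan repayment.",
    "The audit found stable sales and an increase in surplus.",
    "The audit reported an increase in sales.",
]
matrix = tf_idf(docs)
for i, weights in enumerate(matrix):
    print(i, top_keywords(weights))
```

Because "audit" occurs in every toy document, its IDF is log(1) = 0 and it drops out of the keyword list, which is the sense in which TF-IDF measures how significantly, rather than how often, a word is repeated. In practice researchers would more likely rely on an established library; for example, scikit-learn's TfidfVectorizer applies a smoothed IDF by default, so its weights differ from this sketch.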
Opinion mining or sentiment mining, a kind of text mining, provides information even about the
opinions and feelings implied within the text. To further analyze the opinions or sentiments in a
document, follow the steps below. First, specify the range over which the degree of positive, neutral,
and negative is to be expressed. For example, if the maximum value is +2 and the minimum value is –2,
the more positive a word is, the closer its score is to +2; the more negative it is, the closer its score is to
–2; and the more neutral it is, the closer its score is to zero. <Table 3> below is the sentiment
vocabulary dictionary for audit reports presented by Na et al. (2019), which quantifies the words in the
audit report according to their degree of positive, neutral, or negative meaning in context.
<Table 3> Audit report sentiment vocabulary dictionary
Degree of positive, neutral, or negative | Score | Words in the audit report
Negative | -3 | Deficit, Legal action, Damage, Bad money, Difficulty, Abort
Negative | -2 | Reduction, Loan-loss reserves, Expiration, Assurance, Delay
Negative | -1 | Expense, Income tax, Compensation, Division, Normalization, Politics
Source: Na et al. (2019)
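How such a dictionary turns text into a number can be sketched as follows. The entries are a small subset copied from <Table 3>; averaging the scores of the matched words is an assumption made for this sketch, since the aggregation rule used by Na et al. (2019) is not described here, and the function name is hypothetical.

```python
import re

# A few entries copied from <Table 3>; the full dictionary is in Na et al. (2019).
SENTIMENT_LEXICON = {
    "deficit": -3, "damage": -3, "difficulty": -3,
    "reduction": -2, "delay": -2,
    "expense": -1,
    "result": 0, "budget": 0,
    "development": 1, "increase": 1,
    "sales": 2, "surplus": 2,
    "improvement": 3, "stability": 3,
}


def sentiment_score(text, lexicon=SENTIMENT_LEXICON):
    """Average the scores of the lexicon words found in the text.

    Averaging is an assumption of this sketch. The function returns None when
    no lexicon word occurs, so "no evidence" is not confused with "neutral".
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


print(sentiment_score("Sales increase and stability of surplus"))
print(sentiment_score("Deficit and delay caused damage"))
```

The resulting score is an ordinary numeric variable, so it can enter a regression alongside the quantitative financial data that accounting and finance studies already use.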
If a sentiment dictionary already exists for the type of document, it can be used for sentiment analysis.
However, although there are currently many English-language sentiment dictionaries, Korean-language
sentiment dictionaries have mostly yet to be established. When using opinion mining or sentiment
mining techniques without an existing sentiment dictionary, researchers should consult a number of
experts in the field to build a sentiment dictionary and verify it before using the data (Yoo et al., 2013).
For example, Na et al. (2019) collected opinions from five experts in the field of audit to build a
sentiment dictionary for the audit reports of Korean KOSPI companies, and verified consistency
through the ICC (Intraclass Correlation Coefficient) test. Generally, the experts' opinions are
considered consistent when the agreement exceeds 80%.
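The consistency check can be illustrated with a small ICC computation. The text does not say which ICC form Na et al. (2019) used, so the sketch below assumes the simplest one-way random-effects form, ICC(1,1) = (MSB − MSW) / (MSB + (k − 1)·MSW), where MSB and MSW are the between-word and within-word mean squares and k is the number of experts. The ratings are invented, and reading the 80% criterion as an ICC of 0.8 is likewise an assumption.

```python
def icc_one_way(ratings):
    """ICC(1,1) for a table with one row per word and one column per expert.

    Assumes at least two words, at least two experts, and some variation in
    the ratings (otherwise the denominator is zero).
    """
    n = len(ratings)        # number of rated words
    k = len(ratings[0])     # number of experts
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-word mean square: how much the words differ from one another.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-word mean square: how much the experts disagree on the same word.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Invented scores from five hypothetical experts for four words.
expert_scores = [
    [-3, -3, -2, -3, -3],
    [0, 0, 0, 1, 0],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 3, 2],
]
icc = icc_one_way(expert_scores)
print(round(icc, 3), "consistent" if icc > 0.8 else "not consistent")
```

When the experts agree perfectly, the within-word mean square is zero and the coefficient equals 1; the more they disagree on individual words, the further it falls.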
Text mining comprises a variety of techniques. Technologies such as information extraction,
summarization, categorization, clustering, and information visualization are used in the text mining
process. These technologies can be useful tools for financial and accounting research. <Table 4> below
describes the techniques that can be used in text mining.
<Table 3> (continued)
Degree of positive, neutral, or negative | Score | Words in the audit report
Neutral | 0 | Individual, Result, Classification, Impact, Budget, Measurement, Efficiency
Positive | 1 | Development, Issuing, New building, Take over, Normal, Increase
Positive | 2 | Sales, New, Surplus, Early, Propulsion
Positive | 3 | Improvement, Stability, Closing, Dismissue
The Effect of the Introduction of Text Mining Analysis on the Accounting and Finance Research and Proposals for Graduate School Education / Lee, Kun Chang*·Na, Hyung Jong * 247
<Table 4> Techniques in text mining
Technique | Explanation
Information extraction | Information extraction is the initial step for a computer to analyze unstructured text, identifying key phrases and relationships within the text.
Categorization | Categorization automatically assigns one or more categories to a free-text document. It is a supervised learning method because it classifies new documents based on input-output examples.
Clustering | Clustering can be used to find groups of documents with similar content.
Visualization | In text mining, visualization methods can improve and simplify the discovery of relevant information.
Summarization | Text summarization reduces the length and detail of a document while retaining its most important points and general meaning. It helps the user figure out whether a lengthy document meets their needs and is worth reading for further information, so a summary can stand in for the set of documents.
Source: Gaikwad et al. (2014)
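To make the clustering row of the table concrete, the following is a minimal Python sketch that groups documents by the cosine similarity of their word-count vectors. It uses a simple greedy threshold rule as a stand-in, not a standard algorithm such as k-means. The threshold of 0.3, the function names, and the four sample documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase alphabetic tokens in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.3):
    """Greedy grouping: a document joins the first cluster whose first
    member (the seed) is at least `threshold` similar; otherwise it
    starts a new cluster. Returns lists of document indices."""
    vectors = [bag_of_words(d) for d in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical documents for illustration only.
docs = [
    "audit report going concern opinion",
    "audit opinion in the audit report",
    "stock price news and stock returns",
    "news about stock returns",
]
print(cluster(docs))  # [[0, 1], [2, 3]]
```

With real disclosures, raw word counts would normally be replaced by weighted vectors such as TF-IDF before similarity is computed.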
Ⅴ. Proposals for text mining education curricula in the accounting and finance field
Accounting and finance research studies firms rather than individuals. Information about
companies is very diverse and extensive. Such firm information may be disclosed as quantitative data,
such as financial statements, but may also be disclosed as qualitative data, such as business reports and
audit reports. Until now, empirical studies in the accounting and financial sectors have been conducted
using only quantitative data, since statistical analysis must be performed. This limits the scope of
research.
Text mining techniques help to quantify unstructured text big data for empirical analysis. By enabling
the study of qualitative data that has not previously been used as research material, text mining can play
an important role in expanding the scope of research. Therefore, we think the use of text mining
techniques should become more common for the development of accounting and financial research.
To realize this, the fundamental solution is to provide text mining education in graduate master's
and Ph.D. programs. The following are our proposals for such education.
First, classes on the use of text mining techniques should be added to the graduate education
curriculum. Both theoretical and practical classes are necessary. As a theoretical class, a class that
explains the "utilization of unstructured text big data" is basically necessary: a theoretical explanation
of why textual data is needed and where it can be used has to come first. As a practical class, while
accounting and financial research primarily uses the statistical programs SAS or STATA, text mining
requires a "class on R or Python". Researchers first need R or Python to crawl and collect the necessary
text material, and it is also more efficient to use R or Python to process the collected text data and
derive TF-IDF values. Statistical programs such as SAS and STATA can then be used for empirical
analyses, such as regression, after the text data has been quantified.
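As an illustration of the TF-IDF step mentioned above, the following is a minimal Python sketch using only the standard library. It uses one common form of the weight, term frequency multiplied by the logarithm of the inverse document frequency. Other variants (for example, with smoothing) exist, and the paper does not say which one it has in mind. The function names and the three sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = count of the term / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sentences for illustration only.
docs = [
    "the auditor issued a qualified opinion",
    "the auditor issued an unqualified opinion",
    "the firm reported a surplus",
]
weights = tf_idf(docs)
# The highest-weighted term of the first document:
print(max(weights[0], key=weights[0].get))  # qualified
```

A word that appears in every document, such as "the" here, receives a weight of zero, which is why TF-IDF highlights the terms that distinguish one document from the others. The resulting weights could then be exported for regression in SAS or STATA, as the paragraph above describes.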
Second, incumbent professors in the accounting and finance fields who teach master's and Ph.D.
students were not themselves educated in text mining, so in reality it is difficult for them to teach these
courses. Therefore, at this point, the problem needs to be solved by hiring outside instructors who can
teach classes on text mining. After a certain period of time, this approach will allow students trained in
text mining to become professors and nurture future students.
Just as the business environment is changing with artificial intelligence technology, so is the research
environment. The accounting and finance sectors should also detect these changes in the research
environment and come up with appropriate countermeasures. Text mining, a technique for analyzing
unstructured text big data, is considered the most useful artificial intelligence technology for
accounting and financial research.
<Table 5> below shows the artificial intelligence management curriculum at OO graduate school.
By analyzing this curriculum, we present a curriculum that accounting and finance departments could
practically introduce. In the artificial intelligence management major shown below, the first semester
opens with a theory class, artificial intelligence business, which explains the changing business
environment and teaches students why artificial intelligence technology needs to be introduced. In the
second semester, a basic statistical program class teaches how to use SAS. In the third semester,
practical training in R or Python begins in earnest as hands-on preparation for text mining; at the same
time, a class on artificial intelligence, accounting and finance teaches how text material can be applied
in accounting and finance research. In the fourth semester, more in-depth R or Python training helps
students learn the skills needed to quantify text. In the fifth semester, practical training in text mining
is completed by teaching opinion mining and sentiment mining techniques in the class on text big data
analysis and practice. At the same time, the curriculum is rounded off by the class on the 4th Industrial
Revolution and social responsibility, which covers the social problems that artificial intelligence may
cause, countermeasures to them, and ethics.
<Table 5> Example of a graduate course in artificial intelligence management
Semester | Subject | Description of the subject
First semester | Artificial intelligence business | Research the impact of artificial intelligence on business as a whole, and learn about future responses to it.
Second semester | Practice of management statistics (SAS) | Learn about and practise SAS as a statistical program suited to handling big data.
Third semester | AI programming 1 (R, Python) | Practice programming text mining techniques using R and Python.
Third semester | Artificial intelligence, accounting and finance | Learn examples and techniques for applying artificial intelligence technology to the accounting and finance sectors.
Fourth semester | AI programming 2 (R, Python) | Practice programming machine learning techniques using R and Python.
Fourth semester | Artificial intelligence and marketing | Learn cases and techniques for applying artificial intelligence technology in the marketing field.
Fifth semester | Text big data analysis and practice | Learn how to extract the key words from the text big data and how to obtain the constant value.
Fifth semester | The 4th Industrial Revolution and Social Responsibility | Learn about the problems that may arise from the introduction of artificial intelligence technology in the 4th Industrial Revolution and the need for ethical regulation and social responsibility.
It may not be possible to allocate many classes to text mining in graduate programs in
accounting and finance, but we think the curriculum should be reorganized so that students can take at
least one theoretical class and one practical class. Classes such as ‘Artificial intelligence business’ and
‘Artificial intelligence, accounting and finance’ are suitable as theoretical classes, and classes such as
‘AI programming 1 (R, Python)’ and ‘Text big data analysis and practice’ are suitable as practical
classes. In addition to the existing basic statistics classes, students need to be trained to use text mining
through additional statistics classes using R and Python.
Overseas studies in the accounting and finance sectors, as shown in <Table 1> of this study, show a
growing trend in big data analytics. In order to keep up with this global research trend, we need to
reform the graduate school curriculum to improve the research skills of the next generation.
VI. Conclusion
This paper explains the impact of text mining techniques, which are unstructured text big data
analysis methods, on research in the accounting and finance sectors, and puts forward appropriate
teaching methods to introduce them in these sectors.
We pointed out that a limitation of empirical studies in the accounting and finance sectors is that
they use only quantified data, which restricts the scope of research. We argued that, to advance
research in these sectors, convergence should be attempted by introducing useful research
methodologies from other academic fields. Text mining, a technique actively used in the field of
management information systems, should be adopted, and a new curriculum for it should be
implemented in master's and Ph.D. programs in order to develop accounting and financial research.
To summarize the need for text mining techniques argued in this paper: among artificial
intelligence technologies, text mining makes it possible to use non-quantitative data such as text
in empirical research.
A large amount of information about an enterprise cannot be expressed in quantitative terms. A firm's
financial statements describe its financial position and performance as quantitative information, and in
accounting and finance studies almost all of the quantitative data in financial statements is already used
for empirical analysis. Business reports, audit reports, and similar documents, however, express
important information about the enterprise as qualitative data such as text.
If text mining techniques are used, qualitative data such as text can be quantified and therefore
used for empirical analysis. This expands both the available research data and the scope of what can be
studied. In other words, text mining can be an important means of studying many areas that have not
yet been studied.
Currently, professors in the accounting and finance fields belong to a generation that was not trained
in artificial intelligence technologies such as text mining, so it is practically impossible for them to apply
this method to research or to teach it directly to graduate students. However, for the development of
research in accounting and finance and of the next generation of researchers, training in text mining, a
technique for analyzing unstructured text among artificial intelligence technologies, is required in
graduate education.
For now, graduate schools should provide this education in their master's and Ph.D. courses by
recruiting outside text mining experts. Over time, the proportion of accounting and finance professors
who have acquired text mining skills will gradually increase. At least one theoretical class and one
practical class on text mining should be opened in the graduate curriculum for accounting and finance
majors to help keep up with changing global research trends. Accounting and financial research can
advance further when innovative new methods are accepted without fear of change. To do this, graduate
students, who are the future accounting and finance researchers, should be educated in text mining
technology, one of the artificial intelligence technologies.
The contributions of this paper are as follows. First, we explained the need to introduce text mining
techniques for unstructured text big data analysis as a response to the need for change in accounting
and financial research. Second, we introduced the concepts and procedures of big data and text
mining techniques in detail. Third, as a fundamental solution for bringing text mining technology into
research in the accounting and finance sectors, we argued for a reform of graduate education and
proposed concrete measures for it.
■ Submitted: 2019. 10. 11 ■ Final review: 2019. 10. 30 ■ Accepted for publication: 2019. 11. 08
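References
김유신·김남규·정승렬(2012), “News and stock prices: an intelligent investment decision-making model based on big data sentiment analysis,” 지능정보연구, 한국지능정보시스템학회, 18(2), 143-168.
김재봉·김형중(2017), “A method for building a domain-specific sentiment dictionary for predicting stock index direction,” 한국디지털콘텐츠학회 논문지, 18(3), 585-592.
나형종·최석재·권오병(2018), “The Association of Institutional Information on Websites with Present and Future Financial Performance,” 한국전자거래학회지, 23(4), 63-85.
나형종·이건창·최승욱·김성태(2019), “A study on the appropriateness of audit opinions using unstructured content analysis of audit reports together with audit fees and audit hours: focusing on the application of text mining and sentiment analysis techniques,” 회계학연구, 44(4), 175-214.
모예린·서윤석(2019), “Changes in footnote content and the stock market: the effect of changes in footnote content on the cost of equity capital, stock trading volume, and the earnings response coefficient,” 회계학연구, 44(4), 215-249.
어균선·이건창(2019), “A study on feature selection methods for effective emotion mining,” 디지털융복합연구, 17(3), 107-117.
유선희(2013), “Big data-based analysis of industrial market information,” 한국과학기술정보연구원 report.
유승의·홍순구·이태헌·김나랑(2018), “Analysis of bus complaint patterns in Busan through text network analysis,” 전산회계연구, 16(2), 19-43.
유은지·김유신·김남규·정승렬(2013), “A method for building a topic-oriented sentiment dictionary for predicting stock index direction,” 지능정보연구, 19(1), 95-110.
육근효(2018), “The relationship between CEOs' social responsibility messages and sustainability performance: using a text mining approach,” 회계저널, 27(1), 253-279.
이희승·권오병(2015), “Extraction of corporate CSR factors using text mining and analysis of their relationship with corporate performance,” 한국 IT 서비스학회 학술대회 논문집, 577-580.
장준규·이규현·이준기(2016), “The effect of investment strategy report titles on stock price prediction: focusing on text mining,” 한국빅데이터학회지, 1(2), 21-34.
최정원·한호선·이미영·안준모(2015), “A study on corporate bankruptcy prediction using text mining methodology,” 생산성논집 (formerly 생산성연구), 29(1), 201-228.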
Al-Maimani, M., N. Salim, and A. M. Al-Naamany(2014), “Semantic and fuzzy aspects of
opinion mining,” Journal of Theoretical and Applied Information Technology, 63(2),
330-342.
Anand, D., and D. Naorem(2016), “Semi-supervised aspect based sentiment analysis for movies using
review filtering,” Procedia Computer Science, 84, 86-93.
Boskou, G., E. Kirkos, and C. Spathis(2018), “Assessing internal audit with text mining,” Journal of
Information and Knowledge Management, 17(2), 1-22.
Byrnes, P. E., A. Al-Awadhi, B. Gullvist, H. Brown-Liburd, R. Teeter, J. D. Warren Jr, and M.
Vasarhelyi(2018), “Evolution of auditing: from the traditional approach to the future audit,”
In Continuous Auditing: Theory and Application, Emerald Publishing Limited, 285-297.
Campos, R., Mangaravite, V., Pasquali, A., Jorge, A. M., Nunes, C., and Jatowt, A. (2018), “YAKE!
collection-independent automatic keyword extractor,” In European Conference on Information
Retrieval, Springer, Cham, 806-810.
Cheng, X., Huang, D., Chen, J., Meng, X., and Li, C.(2019), “An Investigation on Factors Affecting
Stock Valuation Using Text Mining for Automated Trading,” Sustainability, 11(7), 1938.
Cockcroft, S., and Russell, M.(2018), “Big data opportunities for accounting and finance practice and
research,” Australian Accounting Review, 28(3), 323-333.
Fisher, I. E., Garnsey, M. R., and Hughes, M. E.(2016), “Natural language processing in accounting,
auditing and finance: A synthesis of the literature with a roadmap for future research,”
Intelligent Systems in Accounting, Finance and Management, 23(3), 157-214.
Gaikwad, S. V., Chaugule, A., and Patil, P.(2014), “Text mining methods and techniques,”
International Journal of Computer Applications, 85(17).
Gandomi, A., and Haider, M.(2015), “Beyond the hype: Big data concepts, methods, and analytics,”
International journal of information management, 35(2), 137-144.
Goltz, N., and Mayo, M.(2017), “Enhancing regulatory compliance by using artificial intelligence text
mining to identify penalty clauses in legislation,” In MIREL 2017-Workshop on MIning and
REasoning with Legal texts, June 16th.
Gupta, V. and G. S., Lehal(2009), “A survey of text mining techniques and applications,” Journal of
Emerging Technologies in Web Intelligence, 1(1), 60-76.
Janvrin, D. J., and Watson, M. W.(2017), “Big Data: A new twist to accounting,” Journal of
Accounting Education, 38, 3-8.
Kågebäck, M., O. Mogren, N. Tahmasebi, and D. Dubhashi(2014), “Extractive summarization using
continuous vector space models,” In Proceedings of the 2nd Workshop on Continuous Vector
Space Models and Their Compositionality, 31-39.
Koltringer, C., and Dickinger, A.(2015), “Analyzing destination branding and image from online
sources: a web content mining approach,” Journal of Business Research, 68(9), 1836-1843.
Lee, I.(2017), “Big data: Dimensions, evolution, impacts, and challenges,” Business Horizons, 60(3),
293-303.
Noh, H., Y. Jo, and S. Lee(2015), “Keyword selection and processing strategy for applying text mining
to patent analysis,” Expert Systems with Applications, 42(9), 4348-4360.
Paltoglou, G., and Thelwall, M.(2010), “A study of information retrieval weighting schemes for
sentiment analysis,” In Proceedings of the 48th annual meeting of the association for
computational linguistics, Association for Computational Linguistics, 1386-1395.
Pejić Bach, M., Krstić, Ž., Seljan, S., and Turulja, L.(2019), “Text mining for big data analysis in
financial sector: a literature review,” Sustainability, 11(5), 1277.
Ramos, J.(2003), “Using TF-IDF to determine word relevance in document queries,” In Proceedings of
the First Instructional Conference on Machine Learning, 242, 133-142.
Ravi, K. and V. Ravi(2015), “A survey on opinion mining and sentiment analysis: tasks, approaches and
applications,” Knowledge-Based Systems, 89, 14-46.
Rezaee, Z., and Wang, J.(2019), “Relevance of big data to forensic accounting practice and education,”
Managerial Auditing Journal, 34(3), 268-288.
Sivarajah, U., Kamal, M. M., Irani, Z., and Weerakkody, V.(2017), “Critical analysis of Big Data
challenges and analytical methods,” Journal of Business Research, 70, 263-286.
Wu, Y. and M. Ester(2015), “Flame: a probabilistic model combining aspect based opinion mining and
collaborative filtering,” In Proceedings of the Eighth ACM International Conference on Web
Search and Data Mining, ACM, 199-208.
Wu, G. G. R., Hou, T. C. T., and Lin, J. L.(2019), “Can economic news predict Taiwan stock market
returns?,” Asia Pacific Management Review, 24(1), 54-59.
Yang, R., Y. Yu, M. Liu, and K. Wu(2018), “Corporate risk disclosure and audit fee: a text mining
approach,” European Accounting Review, 27(3), 583-594.
Zhang, W., T. Yoshida, and X. Tang(2011), “A comparative study of TF-IDF, LSI and multi-words for
text classification,” Expert Systems with Applications, 38(3), 2758-2765.
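Korean Abstract
The Effect of the Introduction of Text Mining Analysis on the Accounting and Finance Research and Proposals for Graduate School Education
Lee, Kun Chang*․Na, Hyung Jong**
This paper explains the impact of text mining, one of the methods for analyzing unstructured big data, on research in accounting and finance, and presents appropriate educational measures for introducing it into these fields. The limitation of empirical studies in accounting and finance is that they can only use quantified data, which restricts the scope of research. To advance research in accounting and finance, text mining techniques that can analyze documents, which are qualitative data, should be introduced, and a corresponding curriculum should be newly established and taught in master's and Ph.D. programs. The contributions of this paper are as follows. First, it argues that accounting and finance research needs new change and, as a response, explains the need to introduce text mining techniques for analyzing unstructured text big data. Second, it introduces in detail the concepts and procedures of big data and text mining techniques. Third, as a fundamental solution for applying text mining technology to accounting and finance research, it argues for a reform of the graduate curriculum and presents concrete improvement measures.
|Keywords| text mining, accounting, finance, graduate education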
* First author, Professor, Department of Global Business Administration, SKKU Business School, Sungkyunkwan University (kunchanglee@gmail.com)
** Corresponding author, Research Professor, SKKU Business School, Sungkyunkwan University (fresh_na_77@hanmail.net)
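1. Lead author
Lee, Kun Chang: kunchanglee@gmail.com
Currently a professor in the Department of Global Business Administration, SKKU Business School,
Sungkyunkwan University. Received master's and Ph.D. degrees in management from KAIST. The main
research interest is the convergence of artificial intelligence and management.
2. Corresponding author
Na, Hyung Jong: fresh_na_77@hanmail.net
Currently a research professor at SKKU Business School, Sungkyunkwan University. Received master's
and Ph.D. degrees in accounting from Kyung Hee University. The main research interest is accounting
research using text mining, machine learning, and deep learning.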