Strategies in Computer-Assisted Text Analysis

Alan Brier, Elisabetta de Giorgi, Bruno Hopp

Research output: Working paper

Abstract

This paper reviews the logic of attempts to automate the processes involved in computer-assisted text analysis in the social sciences. Bayesian estimation methods in spatial analysis of variations in positions of political parties over time and Latent Dirichlet Allocation from the developing field of latent topic analysis are compared with the analysis of structures of word co-occurrences in the tradition of content analysis, using Procrustean individual differences scaling. Each depends in practice on concentrating attention on a limited number of word tokens regarded as meaningful while most are disregarded as inessential. By applying apparently competing strategies to the same set of party contributions to the 1997 budget debate in the Italian parliament, they can be shown to be complementary in character and should be applied as such in comparing material of this kind.
Original language: English
Publisher: National Centre for Research Methods
Pages: 1-21
Number of pages: 21
Publication status: Published - 19 Jul 2016

Fingerprint

text analysis
scaling
parliament
content analysis
budget
social science

Keywords

  • Text Analysis
  • Social Sciences

Cite this

Brier, A., de Giorgi, E., & Hopp, B. (2016). Strategies in Computer-Assisted Text Analysis. (pp. 1-21). National Centre for Research Methods.
Brier, Alan ; de Giorgi, Elisabetta ; Hopp, Bruno. / Strategies in Computer-Assisted Text Analysis. National Centre for Research Methods, 2016. pp. 1-21
@techreport{78e28309176e468aa3197c424299575e,
title = "Strategies in Computer-Assisted Text Analysis",
abstract = "This paper reviews the logic of attempts to automate the processes involved in computer-assisted text analysis in the social sciences. Bayesian estimation methods in spatial analysis of variations in positions of political parties over time and Latent Dirichlet Allocation from the developing field of latent topic analysis are compared with the analysis of structures of word co-occurrences in the tradition of content analysis, using Procrustean individual differences scaling. Each depends in practice on concentrating attention on a limited number of word tokens regarded as meaningful while most are disregarded as inessential. By applying apparently competing strategies to the same set of party contributions to the 1997 budget debate in the Italian parliament, they can be shown to be complementary in character and should be applied as such in comparing material of this kind.",
keywords = "Text Analysis, Social Sciences",
author = "Alan Brier and {de Giorgi}, Elisabetta and Bruno Hopp",
note = "info:eu-repo/grantAgreement/FCT/5876/147295/PT# UID/CPO/04627/2013, SFRH/BPD/78955/2011.",
year = "2016",
month = "7",
day = "19",
language = "English",
pages = "1--21",
publisher = "National Centre for Research Methods",
type = "WorkingPaper",
institution = "National Centre for Research Methods",

}

Brier, A, de Giorgi, E & Hopp, B 2016 'Strategies in Computer-Assisted Text Analysis' National Centre for Research Methods, pp. 1-21.

Strategies in Computer-Assisted Text Analysis. / Brier, Alan; de Giorgi, Elisabetta; Hopp, Bruno.

National Centre for Research Methods, 2016. p. 1-21.

TY - UNPB

T1 - Strategies in Computer-Assisted Text Analysis

AU - Brier, Alan

AU - de Giorgi, Elisabetta

AU - Hopp, Bruno

N1 - info:eu-repo/grantAgreement/FCT/5876/147295/PT# UID/CPO/04627/2013, SFRH/BPD/78955/2011.

PY - 2016/7/19

Y1 - 2016/7/19

N2 - This paper reviews the logic of attempts to automate the processes involved in computer-assisted text analysis in the social sciences. Bayesian estimation methods in spatial analysis of variations in positions of political parties over time and Latent Dirichlet Allocation from the developing field of latent topic analysis are compared with the analysis of structures of word co-occurrences in the tradition of content analysis, using Procrustean individual differences scaling. Each depends in practice on concentrating attention on a limited number of word tokens regarded as meaningful while most are disregarded as inessential. By applying apparently competing strategies to the same set of party contributions to the 1997 budget debate in the Italian parliament, they can be shown to be complementary in character and should be applied as such in comparing material of this kind.

AB - This paper reviews the logic of attempts to automate the processes involved in computer-assisted text analysis in the social sciences. Bayesian estimation methods in spatial analysis of variations in positions of political parties over time and Latent Dirichlet Allocation from the developing field of latent topic analysis are compared with the analysis of structures of word co-occurrences in the tradition of content analysis, using Procrustean individual differences scaling. Each depends in practice on concentrating attention on a limited number of word tokens regarded as meaningful while most are disregarded as inessential. By applying apparently competing strategies to the same set of party contributions to the 1997 budget debate in the Italian parliament, they can be shown to be complementary in character and should be applied as such in comparing material of this kind.

KW - Text Analysis

KW - Social Sciences

M3 - Working paper

SP - 1

EP - 21

BT - Strategies in Computer-Assisted Text Analysis

PB - National Centre for Research Methods

ER -

Brier A, de Giorgi E, Hopp B. Strategies in Computer-Assisted Text Analysis. National Centre for Research Methods. 2016 Jul 19, p. 1-21.