This vignette demonstrates how to use the functions provided in the conversation_multidyads.R file to analyze conversations across multiple dyads. These functions allow you to preprocess conversation data and calculate various similarity measures between conversation participants.
We’ll use the provided dataset “dyad_example_data.Rdata” located in the inst/extdata directory of the package:
library(conversim)
# Locate the example data shipped with the package and load it into the session
data_path <- system.file("extdata", "dyad_example_data.Rdata", package = "conversim")
load(data_path)
# Display the first few rows and structure of the data
head(dyad_example_data)
#> # A tibble: 6 × 3
#> dyad_id speaker_id text
#> <dbl> <chr> <chr>
#> 1 1 A What did you think of the new movie that just came out?
#> 2 1 B I haven’t seen it yet. Which one are you referring to?
#> 3 1 A The latest superhero film. I heard it’s getting great revi…
#> 4 1 B Oh, that one! I’ve been meaning to watch it. Did you enjoy…
#> 5 1 A Yes, I thought it was fantastic. The special effects were …
#> 6 1 B Really? What about the storyline? I heard it’s a bit predi…
str(dyad_example_data)
#> tibble [532 × 3] (S3: tbl_df/tbl/data.frame)
#> $ dyad_id : num [1:532] 1 1 1 1 1 1 1 1 1 1 ...
#> $ speaker_id: chr [1:532] "A" "B" "A" "B" ...
#> $ text : chr [1:532] "What did you think of the new movie that just came out?" "I haven’t seen it yet. Which one are you referring to?" "The latest superhero film. I heard it’s getting great reviews." "Oh, that one! I’ve been meaning to watch it. Did you enjoy it?" ...
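Before preprocessing, it can help to check how many turns and speakers each dyad contributes. The following is a minimal sketch in base R on a small made-up data frame with the same three columns as dyad_example_data; the object `toy` and its values are illustrative assumptions, not part of the package.

```r
# Hypothetical toy data mirroring the columns of dyad_example_data
toy <- data.frame(
  dyad_id = c(1, 1, 1, 2, 2),
  speaker_id = c("A", "B", "A", "A", "B"),
  text = c("hello there", "hi", "how are you", "good morning", "morning"),
  stringsAsFactors = FALSE
)

# Number of turns (rows) per dyad
turns_per_dyad <- table(toy$dyad_id)

# Number of distinct speakers per dyad
speakers_per_dyad <- tapply(
  toy$speaker_id, toy$dyad_id,
  function(s) length(unique(s))
)

print(turns_per_dyad)
print(speakers_per_dyad)
```

Replacing `toy` with dyad_example_data should give the same kind of summary for the example data.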
Before analyzing the conversations, we need to preprocess the text data:
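processed_convs <- preprocess_dyads(dyad_example_data)
# The call below prints the original data; use head(processed_convs) to inspect the preprocessed result
head(dyad_example_data)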
#> # A tibble: 6 × 3
#> dyad_id speaker_id text
#> <dbl> <chr> <chr>
#> 1 1 A What did you think of the new movie that just came out?
#> 2 1 B I haven’t seen it yet. Which one are you referring to?
#> 3 1 A The latest superhero film. I heard it’s getting great revi…
#> 4 1 B Oh, that one! I’ve been meaning to watch it. Did you enjoy…
#> 5 1 A Yes, I thought it was fantastic. The special effects were …
#> 6 1 B Really? What about the storyline? I heard it’s a bit predi…
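To give a sense of what text preprocessing of this kind usually involves, here is a minimal base R sketch that lowercases text, replaces punctuation with spaces, and collapses whitespace. The helper `clean_text_sketch` is hypothetical and is not claimed to match what preprocess_dyads does internally.

```r
# Hypothetical helper: a common, simple text-cleaning routine
clean_text_sketch <- function(x) {
  x <- tolower(x)                    # normalize case
  x <- gsub("[[:punct:]]+", " ", x)  # replace punctuation with spaces
  x <- gsub("\\s+", " ", x)          # collapse runs of whitespace
  trimws(x)                          # drop leading/trailing spaces
}

cleaned <- clean_text_sketch("Oh, that one! Did you enjoy it?")
print(cleaned)
```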
Now, let’s calculate various similarity measures for our preprocessed conversations.
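As an illustration of what a per-dyad similarity measure can look like, the sketch below computes the Jaccard word overlap between each turn and the turn that follows it, separately for each dyad. This is a simple stand-in written in base R on made-up data; the names `toy`, `jaccard`, and `consecutive_sim` are assumptions for this example, and the package's own similarity functions may work differently.

```r
# Hypothetical toy data mirroring the columns of dyad_example_data
toy <- data.frame(
  dyad_id = c(1, 1, 1, 2, 2, 2),
  speaker_id = c("A", "B", "A", "A", "B", "A"),
  text = c("did you see the new movie",
           "no which movie do you mean",
           "the new superhero movie",
           "how was the trip",
           "the trip was great",
           "great to hear"),
  stringsAsFactors = FALSE
)

# Jaccard similarity between the word sets of two utterances
jaccard <- function(a, b) {
  wa <- unique(strsplit(a, "\\s+")[[1]])
  wb <- unique(strsplit(b, "\\s+")[[1]])
  length(intersect(wa, wb)) / length(union(wa, wb))
}

# For each dyad, compare every turn with the turn that follows it
consecutive_sim <- lapply(split(toy$text, toy$dyad_id), function(turns) {
  vapply(seq_len(length(turns) - 1),
         function(i) jaccard(turns[i], turns[i + 1]),
         numeric(1))
})

print(consecutive_sim)
```

The result is a named list with one numeric vector per dyad, which is the same general shape that the plotting code in this vignette expects to find in a `similarities_by_dyad` element.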
Let’s visualize the results of our similarity analyses using ggplot2. Here’s an example of how to plot the topic similarity for each dyad:
library(ggplot2)

# Assumes `topic_sim` holds the topic similarity results, with a
# `similarities_by_dyad` element: a named list of numeric vectors, one per dyad
topic_sim_df <- data.frame(
  dyad = rep(names(topic_sim$similarities_by_dyad),
             sapply(topic_sim$similarities_by_dyad, length)),
  similarity = unlist(topic_sim$similarities_by_dyad),
  index = unlist(lapply(topic_sim$similarities_by_dyad, seq_along))
)

# One panel per dyad, showing similarity over the course of the conversation
ggplot(topic_sim_df, aes(x = index, y = similarity, color = dyad)) +
  geom_line() +
  geom_point() +
  facet_wrap(~dyad, ncol = 2) +
  labs(title = "Topic Similarity Across Dyads",
       x = "Conversation Sequence",
       y = "Similarity Score") +
  theme_minimal() +
  theme(legend.position = "none")
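Alongside the plot, a per-dyad summary can make the dyads easier to compare. The sketch below averages the similarity scores within each dyad using base R's aggregate(). It runs on a made-up data frame, `toy_sim_df`, with the same columns as topic_sim_df; the values are illustrative only.

```r
# Hypothetical long-format data with the same columns as topic_sim_df
toy_sim_df <- data.frame(
  dyad = rep(c("1", "2"), each = 3),
  similarity = c(0.2, 0.4, 0.6, 0.5, 0.7, 0.9),
  index = rep(1:3, times = 2),
  stringsAsFactors = FALSE
)

# Mean similarity per dyad
dyad_means <- aggregate(similarity ~ dyad, data = toy_sim_df, FUN = mean)
print(dyad_means)
```

Passing topic_sim_df instead of `toy_sim_df` should produce one mean similarity score per dyad in the example data.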