
1 Introduction

Unlike traditional resource curation (e.g., asking a colleague or turning to one’s district or state department of education), the diffusion of information from social media to the classroom is significantly more efficient and scalable. Hence, teachers increasingly use social media to supplement their instructional resources [7, 8, 11, 16, 17]. According to a survey from the RAND Corporation [11], more than 87% of elementary school teachers and 62% of secondary school teachers use Pinterest for professional purposes. Furthermore, since the onset of the coronavirus pandemic in 2020, instructional resources and homeschooling have been among the top three most frequent searches on Pinterest (personal communication, April 1, 2020). Given the breadth of teachers’ online activities, particularly on social media, and their direct diffusion into classrooms, it is important to understand how teachers harness social media to diffuse their classroom ideas, lessons, and practices to a community of colleagues.

To study teachers in social media, researchers usually follow a bottom-up data collection approach in which they first survey teachers offline and then retrieve their online data [2, 3, 12, 13, 14]. However, this common approach is restrictive because there are potentially many other teachers online who are not among the sampled and surveyed teachers. In other words, the surveyed teachers may not be representative of the population of teachers in online social media. Moreover, the survey process is usually costly and time-consuming. In this paper, we complement bottom-up data collection with a scalable top-down approach. More specifically, we first survey 541 teachers across 5 U.S. states and 48 districts and then, using the surveyed teachers as seed samples, acquire their Pinterest data (a bottom-up approach). We then propose a top-down approach: a heuristic that automatically identifies new likely teachers on Pinterest beyond our surveyed teachers. Finally, we use advances in machine learning and social network analysis to evaluate the performance of our heuristic.

2 Related Work

Research shows that teachers use various online platforms for educational engagement, including Facebook, Twitter, and Pinterest. Steinbrecher and Hart [14] showed that, in addition to personal usage, teachers use Facebook for professional activities such as “classroom support and strategy idea generation”. The authors of [3] explored Twitter usage by K-16 educators and found that many educators use Twitter for professional development. Similar studies [1, 4] showed that pre-service teachers use Twitter for professional development purposes such as sharing resources and connecting with other teachers. Carpenter et al. [2] indicated that teachers use Pinterest to promote educational materials. In particular, they found that many individuals were sharing resources curated on TeachersPayTeachers.com, a major virtual resource pool where teachers can buy and sell educational resources. We have observed similar results for TeachersPayTeachers.com and Pinterest. Some research has sought to identify who is curating educational resources. The authors of [13] explored the characteristics of teachers contributing to TeachersPayTeachers.com and attempted to identify the profile of resource curators. Similarly, Schroeder et al. [12] showed that teachers mostly use Pinterest to look for educational resources according to their classroom needs. Frank et al. [6] analyzed the role of social networks, and Pinterest in particular, in providing emerging opportunities for education. Torphy et al. [16] examined the diffusion of educational resources on Pinterest; their results indicated that direct connections between teachers spur resource curation. Other work has examined teachers’ use of social media more broadly. The interested reader can refer to [8] for a survey on how to incorporate online social media in educational research.

3 Automated Teacher Identification

Dataset. We surveyed 541 PK-12 teachers across 5 states, 48 districts, and 99 schools. Of these, 432 are female, 13 are male, and 69 are unspecified. For all teachers in our dataset, we acquired their followers and followees (their connections), which resulted in a network with 89,190 nodes (Pinterest users) and 4,379,592 links. For all 89,190 users, we also collected their Pinterest data, i.e., their shared pins.

Fig. 1. Network of surveyed teachers on Pinterest, where colors represent districts

Top-Down Teacher Identification. As mentioned before, bottom-up data collection, in which we first survey teachers offline and then project them into online social media, may not yield a representative sample of teachers in that online space. To illustrate, we visualize the network of our surveyed teachers in Fig. 1. First, a considerable number of teachers have no connection to others (96 teachers, or around 17% of our surveyed teachers). This is undesirable, as we expect teachers to connect with their peers and engage in professional development activities, e.g., sharing resources. Second, the network consists of several disconnected sub-networks (components). This disrupts the diffusion of information among teachers on Pinterest, which plays an essential role in improving the quality of teaching [16]. Third, around 95% of teachers are connected to their peers in the same district, which runs counter to the main strength of social media, i.e., overcoming physical constraints. Hence, we conclude that a more complete picture of the network of teachers on Pinterest is needed, which we pursue as follows.
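The three observations above (isolated teachers, disconnected components, and same-district connectivity) can be computed from an edge list. The following is a minimal sketch, not the analysis code used in this paper: the function name `network_diagnostics`, the input layout, and the choice to measure district homophily as the share of teacher-to-teacher edges within one district are all assumptions made for illustration.

```python
from collections import defaultdict


def network_diagnostics(teachers, edges, district):
    """Summarize isolation, fragmentation, and district homophily.

    teachers: iterable of teacher ids (hypothetical input layout)
    edges:    iterable of (u, v) pairs among teachers, treated as undirected
    district: dict mapping teacher id -> district label
    """
    teachers = list(teachers)
    edges = list(edges)

    # Build an undirected adjacency structure, ignoring self-loops.
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    # Teachers with no connection to any other teacher.
    isolated = [t for t in teachers if not neighbors[t]]

    # Count connected components with an iterative depth-first search.
    seen, components = set(), 0
    for start in teachers:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)

    # Share of distinct edges whose endpoints are in the same district
    # (one possible reading of "connected to peers in the same district").
    unique_edges = {frozenset((u, v)) for u, v in edges if u != v}
    same = sum(1 for e in unique_edges if len({district[n] for n in e}) == 1)
    share = same / len(unique_edges) if unique_edges else 0.0

    return {
        "isolated": isolated,
        "components": components,
        "same_district_edge_share": share,
    }
```

Note that each isolated teacher counts as its own component in this sketch, so a high component count can be driven largely by isolated nodes.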

Fig. 2. Number of TPT pins vs the number of users

In line with previous studies [2, 13], we found that the predominant source of educational resources among teachers is TeachersPayTeachers.com (hereafter referred to as TPT). There are several reasons for this. First, TPT is the largest marketplace of educational resources, offering millions of high-quality PK-12 educational resources. Second, the image-oriented character of TPT resources matches the image-based nature of Pinterest, making the two platforms a natural fit. Finally, content producers on TPT are quite often teachers or educators who join Pinterest to advertise and share their resources [2]. We also found that TPT is the dominant source of resources shared by our surveyed teachers, comprising around 50% of the top 5 pin domains. Hence, we hypothesize that the presence of TPT pins in an account is a strong indication that the account belongs to a teacher or educator.

With the above discussion in mind, we process the pins of all 88,649 other users in our dataset, and if a user’s number of TPT pins exceeds a threshold K, we mark that user as a teacher. Figure 2 shows the number of users with K TPT pins as K varies from 1 to 200. We set K to 100, which marks more than 12,000 users as likely teachers, almost 23 times the number of surveyed teachers. Note that not all of the marked users are necessarily school teachers, since they may be other types of educators such as educational organizations, home teachers, parents, and so on. However, as far as their footprint on Pinterest is concerned, they are similar to our surveyed teachers, and we continue to refer to them as teachers.
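The thresholding heuristic described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the input layout (pairs of user id and pin source URL), the function names, and the rule for matching the TPT domain (the host or any subdomain of `teacherspayteachers.com`) are hypothetical, and "exceeds K" is read as strictly more than K pins.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed domain string used to recognize TPT pins.
TPT_DOMAIN = "teacherspayteachers.com"


def is_tpt_url(url):
    """Return True if the URL's host is TPT or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == TPT_DOMAIN or host.endswith("." + TPT_DOMAIN)


def count_tpt_pins(pins):
    """Count TPT pins per user; pins is an iterable of (user_id, source_url)."""
    counts = Counter()
    for user_id, url in pins:
        if is_tpt_url(url):
            counts[user_id] += 1
    return counts


def likely_teachers(pins, k=100):
    """Mark users whose number of TPT pins is strictly greater than k."""
    counts = count_tpt_pins(pins)
    return {user for user, n in counts.items() if n > k}


def users_per_threshold(counts, thresholds):
    """For each threshold, count users with more than that many TPT pins."""
    return {k: sum(1 for n in counts.values() if n > k) for k in thresholds}
```

Sweeping `users_per_threshold` over a range of thresholds gives a curve of the kind used to choose K: a lower K marks more users but admits accounts with only incidental TPT pins, while a higher K is stricter but yields fewer likely teachers.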

Fig. 3. Evaluating the automated teacher identification

Evaluation. The evaluation process for our automated teacher identification is illustrated in Fig. 3. First, we use the entire constructed network of Pinterest users (89,190 nodes and 4,379,592 links) and extract node features in an unsupervised manner. Feature extraction from a network is an effective approach that is used in many applications [5, 9, 10, 18]. In this paper, we adopt the method proposed by Tang et al. [15], known as LINE (Large-scale Information Network Embedding). Each node is represented by a vector of size 64. Second, on top of the learned node representations, we train two classifiers, both using Random Forest. Both classifiers are tested on the same 100 surveyed teachers and 100 non-teachers. The first classifier is trained on the remaining 441 surveyed teachers and 441 identified non-teachers. The second classifier is trained on 5500 teachers identified using our heuristic and 5500 non-teachers. As non-teacher samples, we use users who have no TPT pin and no educational pin, where a pin is marked as educational by Pinterest’s internal pin labeling system. The accuracy of the first classifier is just 58%, while the second achieves 76%. Hence, we conclude that our heuristic for automated teacher identification is reliable, as training on heuristically identified teachers substantially improves teacher classification performance.
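The data split behind this comparison can be sketched as follows. This is a hedged illustration of the protocol only: it does not reproduce the LINE embeddings or the Random Forest classifiers, and the function names, the random shuffling, and the decision to let both training sets draw non-teachers from the same pool (disjoint from the test set) are assumptions not stated in the text.

```python
import random


def build_splits(surveyed, heuristic_teachers, non_teachers,
                 n_test=100, n_heuristic_train=5500, seed=0):
    """Build one shared test set and two training sets of (user_id, label).

    Label 1 is teacher, label 0 is non-teacher. Ids must be sortable.
    """
    rng = random.Random(seed)

    surveyed = sorted(surveyed)
    rng.shuffle(surveyed)
    negatives = sorted(non_teachers)
    rng.shuffle(negatives)
    # Heuristic teachers must not overlap with surveyed (and test) teachers.
    heuristic = sorted(set(heuristic_teachers) - set(surveyed))
    rng.shuffle(heuristic)

    # Shared test set: held-out surveyed teachers and held-out non-teachers.
    test_pos, train_pos_a = surveyed[:n_test], surveyed[n_test:]
    test_neg, neg_pool = negatives[:n_test], negatives[n_test:]
    train_pos_b = heuristic[:n_heuristic_train]

    if len(neg_pool) < max(len(train_pos_a), len(train_pos_b)):
        raise ValueError("not enough non-teachers for balanced training sets")

    def labeled(pos, neg):
        return [(u, 1) for u in pos] + [(u, 0) for u in neg]

    return {
        "test": labeled(test_pos, test_neg),
        # First setting: remaining surveyed teachers, balanced negatives.
        "train_surveyed": labeled(train_pos_a, neg_pool[:len(train_pos_a)]),
        # Second setting: heuristically identified teachers, balanced negatives.
        "train_heuristic": labeled(train_pos_b, neg_pool[:len(train_pos_b)]),
    }


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because both classifiers are scored on the same held-out users, any difference in accuracy can be attributed to the training set rather than to the test sample.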

In the future, we plan to compare the two datasets, i.e., the surveyed teachers and the augmented version, in terms of the structural properties of the two networks as well as the behavioral attributes of teachers. Further, we intend to study the diffusion of information among teachers and to characterize resources through their diffusion.