ag_ci

  • 5/21/2018 ag_ci

    IBM Social Media Analytics Version 1.3.0

    Administration Guide

    Note: Before using this information and the product it supports, read the information in "Notices" on page 45.

    Product Information

    This document applies to IBM Social Media Analytics Version 1.3.0 and may also apply to subsequent releases.

    Licensed Materials - Property of IBM

    Copyright IBM Corporation 2010, 2014. US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

    Contents

    Introduction

    Chapter 1. What's new?
        Changed features in 1.3.0.0.3
            Simpler BoardReader license key management
        New features in version 1.3
            Start, restart, or stop individual services or web applications
        Changed features in version 1.3
            Changes to importing and exporting hotwords
        New features in version 1.2
            Influencer reports configuration
            Import and export of project configurations
            Document limiter
            Disk space management
            Ability to delete specific analysis data
            Ability to regenerate API keys

    Chapter 2. Overview of Social Media Analytics

    Chapter 3. Export and import project configuration
        Exporting a project configuration
        Importing a project configuration

    Chapter 4. Influencer reports configuration
        Configuring the influencer configuration file
        Updating the influencer configuration file

    Chapter 5. Social Media Analytics services and web applications
        Starting Social Media Analytics services and web applications
        Restarting Social Media Analytics services and web applications
        Stopping Social Media Analytics services and web applications
        Deleting analysis data for specific authors or web addresses
        Regenerating the public and private API keys
        Viewing the limit for downloadable documents from BoardReader
        Changing the limit for downloadable documents from BoardReader

    Chapter 6. Disk space management
        Checking available disk space
        Freeing up space on the Hadoop shared disk
        Freeing up disk space on the data node server

    Chapter 7. BoardReader license management
        Updating the BoardReader license key

    Chapter 8. The Data Fetcher Development Kit

    Appendix A. Accessibility features
        Keyboard shortcuts

    Appendix B. Troubleshooting and support for IBM Social Media Analytics
        Troubleshooting checklist for IBM Social Media Analytics
        Troubleshooting resources for IBM Social Media Analytics
            Support Portal
            Information gathering
            Service requests
            Business Analytics Client Center
            Fix Central
            Knowledge bases
        Collecting logging information
        Jobs fail with error no disk space is available
        Minimizing out of memory errors when running multiple projects simultaneously
            Changing the buffer size for read and write operations
        Minimizing out of memory errors when fetching a large number of documents
        Increasing the heap size for WebSphere Embedded Application Server in large deployments
        Temporary directory not found errors when IBM Hadoop master node mounts external NFS server
        Resolving an error when you create a project

    Notices

    Index

    Introduction

    This administration guide is intended for use in the administration of IBM Social Media Analytics.

    This guide includes procedures about how to manage Social Media Analytics services, back up and restore configuration files, and tune performance.

    Audience

    This administration guide is intended for Social Media Analytics system administrators.

    Finding information

    To find Social Media Analytics product documentation on the web, access the IBM Knowledge Center (http://www.ibm.com/support/knowledgecenter/SSJHE9/welcome). Release Notes are published directly to the Knowledge Center, and include links to the latest technotes and APARs.

    You can also read PDF versions of the product release notes and installation guides directly from IBM product disks.

    Accessibility features

    Accessibility features help users who have a physical disability, such as restricted mobility or limited vision, to use information technology products. Social Media Analytics has accessibility features. For information about these features, see Appendix A, "Accessibility features," on page 35.

    IBM HTML documentation has accessibility features. PDF documents are supplemental and, as such, include no added accessibility features.

    Forward-looking statements

    This documentation describes the current functionality of the product. References to items that are not currently available may be included. No implication of any future availability should be inferred. Any such references are not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of features or functionality remain at the sole discretion of IBM.

    Samples disclaimer

    The Sample Outdoors Company, GO Sales, and any variation of the Sample Outdoors name depict fictitious business operations with sample data used to develop sample applications for IBM and IBM customers. These fictitious records include sample data for sales transactions, product distribution, finance, and human resources. Any resemblance to actual names, addresses, contact numbers, or transaction values is coincidental. Other sample files may contain fictional data manually or machine generated, factual data compiled from academic or public sources, or data used with permission of the copyright holder, for use as sample data to develop sample applications. Product names referenced may be the trademarks of their respective owners. Unauthorized duplication is prohibited.

    Chapter 1. What's new?

    This section contains a list of new features that affect the administration of IBM Social Media Analytics for this release.

    Changed features in 1.3.0.0.3

    Some features have changed in this release.

    Simpler BoardReader license key management

    To be able to retrieve data from BoardReader, IBM Social Media Analytics must be updated with any BoardReader license changes. Previously, updating the BoardReader license key in Social Media Analytics was a multiple-step process. Now, a license update command simplifies the process.

    For more information, see "Updating the BoardReader license key" on page 31.

    New features in version 1.3

    This release contains new features.

    Start, restart, or stop individual services or web applications

    Previously, commands existed to start, restart, or stop Social Media Analytics services or web applications all at once. Now, you can control individual services and web applications. This is especially useful when you need to control a specific service or web application without impacting in-progress jobs.

    Related concepts:

    Chapter 5, "Social Media Analytics services and web applications," on page 17
    You can manage various services and web applications that run on the IBM Social Media Analytics servers.

    Related tasks:

    "Starting Social Media Analytics services and web applications" on page 18
    You can start Social Media Analytics services and web applications individually or all together.

    "Restarting Social Media Analytics services and web applications" on page 19
    You can restart Social Media Analytics services and web applications individually or all together.

    "Stopping Social Media Analytics services and web applications" on page 20
    You can stop Social Media Analytics services and web applications individually or all together.

    Changed features in version 1.3

    In this release, changes have been made to some features.

    Changes to importing and exporting hotwords

    Because hotwords are no longer used for retrieval and analysis of data, they are no longer supported when exporting project configurations.

    If you import a configuration that contains hotwords into version 1.3 and you choose to import themes and concepts, a new theme is created, called Hotwords. The Hotwords theme contains the imported hotwords as concepts.

    For more information about the IBM Social Media Analytics configuration interface, see the Social Media Analytics User Guide.

    Related concepts:

    Chapter 3, "Export and import project configuration," on page 7
    You can export and import the configuration data in a project by doing an export or import from the management console. You can use export and import as part of your backup and restore procedures.

    New features in version 1.2

    This release contains new features.

    Influencer reports configuration

    There are new influencer reports in Reporting. The influencer reports contain data that is provided by a third-party influence score provider. Before you can see influencer data in the influencer reports, you must update the configuration file with information about your influence score providers.

    Related concepts:

    Chapter 4, "Influencer reports configuration," on page 11
    The influencer reports contain data that is provided by a third-party influence score provider. Before you can see influencer data in the influencer reports, you must update the configuration file with information about your influence score providers.

    Import and export of project configurations

    You can import and export the configuration data for a project through the management console.

    Related tasks:

    "Exporting a project configuration" on page 7
    You can export the configuration data for a project by using the management console from the command line.

    "Importing a project configuration" on page 8
    You can import the configuration data for a project by using the management console from the command line.

    Document limiter

    Your license with IBM limits the number of documents that you can retrieve from BoardReader each month. You can change this limit by using commands in the management console. You can configure IBM Social Media Analytics to enforce the document limit, which prevents a job from running if it exceeds your monthly document limit.

    Disk space management

    You can check available disk space and free up disk space on the Hadoop master and slave nodes and on the data node.

    Related tasks:

    "Checking available disk space" on page 27
    You can see how much disk space is used on the Hadoop master and slave nodes.

    "Freeing up space on the Hadoop shared disk" on page 28
    You can free up disk space on the Hadoop master and slave nodes.

    "Freeing up disk space on the data node server" on page 28
    You can free up disk space on the data node server by performing a cleanup operation on a number of different directories.

    Ability to delete specific analysis data

    You can delete content for specific authors or web addresses from the analyzed data that is stored in IBM Social Media Analytics databases.

    Related tasks:

    "Deleting analysis data for specific authors or web addresses" on page 21
    You can delete content for specific authors or web addresses from the analyzed data that is stored in IBM Social Media Analytics databases.

    Ability to regenerate API keys

    You can now regenerate and republish the set of public and private API keys.

    Related tasks:

    "Regenerating the public and private API keys" on page 23
    The set of public and private keys that is used to sign the IBM Social Media Analytics APIs is generated and published as part of the installation. If necessary, you can regenerate and republish a new set of keys.

    Chapter 2. Overview of Social Media Analytics

    IBM Social Media Analytics is a business analysis application that helps your organization analyze content in social media.

    Social Media Analytics helps you gain insight into social media discussions that are related to your key focus for analysis. It helps your organization answer the following types of questions:

    • What are consumers saying and hearing about my brand?

    • What are the most talked about product attributes in my product category? Is the feedback good or bad?

    • What is the competition doing to excite the market?

    • Is my employer reputation affecting my ability to recruit top talent?

    • What are the reputations of the new vendors that I am considering?

    • What issues are most important for my constituency?

    The Welcome page

    You access projects from the Welcome page. For each project, you can go to Reporting, Analysis, and Configuration.

    IBM Social Media Analytics system administrators can create and delete projects. Both system administrators and project administrators can import and export the configuration data for a project.

    In Reporting, you can quickly assess social media results and pinpoint areas for further analysis by using predefined reports that are provided with Social Media Analytics. You can also explore further analysis options by using IBM Cognos Business Intelligence.

    In Analysis, you explore results in detail by slicing and dicing the content. You can analyze snippets by searching, filtering, and drilling down into them.

    In Configuration, you define terms for themes and concepts to extract social media from blogs, discussion forums and message boards, Twitter, news sites, review sites, and video sites. The terms that you define are based on your business objective for analyzing social media. These rules are used to extract sections of text from the set of documents. These sections of text, which are known as snippets, are the parts of a document that are relevant to the area that you want to analyze.

    The Current usage progress bar shows the number of documents that have been retrieved out of your monthly limit. The value is displayed as a percentage. The Current usage progress bar is also displayed in Configuration.

    Access to projects and management of projects is determined by the roles that your user ID is assigned to. The projects and functionality that you see on the Welcome page might be different from the projects and functionality that your colleagues see because your user IDs can be assigned to different roles. For more information about users and roles, see the IBM Social Media Analytics User Guide.

    The Welcome page contains the following links:

    • Help link to the product documentation.

    • Getting Started link that displays The Welcome page topic.

    • How To Videos link to a series of short videos that demonstrate Social Media Analytics functionality.

    Social Media Analytics key terms

    To work with Social Media Analytics effectively, you should understand the following terms:

    Use case
        A specific scenario that you want to analyze. For example, in Social Media Analytics you can analyze features of a product, analyze topics over different time periods, and allow different groups of people to access an analysis. Each of these scenarios is a use case. You configure a use case in Configuration, and review and analyze the results in Reporting and Analysis.

    Project
        The environment in Social Media Analytics where use cases are configured, reviewed, and analyzed. You can create one project for each use case, which helps you to manage each use case separately. You can also re-use projects for different use cases by changing the configuration.

    Source
        The type of media site that a document comes from. In Social Media Analytics, sources are blogs, discussion forums and message boards, Twitter, news sites, review sites, and video sites. For example, the source for Facebook documents is message boards.

    Snippet
        A segment of text that is relevant to your analysis. A snippet is identified by using concept definitions.

    Concept
        A subject, topic, or idea to search for in social media that is relevant for a specific use case. For example, if you are doing competitive analysis, you can define a concept for the competitor name or the product name of a competitor. Social Media Analytics extracts concepts from the blogs, discussion forums and message boards, Twitter, news sites, review sites, and video sites.

    Theme
        The theme is a central subject for analysis, and contains related concepts. Projects contain one or more themes.

    Sentiment term
        A word or words that express the tone of a sentiment. Social Media Analytics applies linguistic rules to sentiment terms and creates sentiment phrases in a snippet. These phrases are used to determine the overall sentiment of the snippet. The sentiment can be positive, negative, neutral, or ambivalent.

    Media set
        A group of web addresses that represents a specific category of website.

    Area of analysis
        In Reporting, the reports are organized into three areas of analysis: Social Media Impact, Segmentation, and Discovery. Each area of analysis has a set of reports that enables sophisticated reporting and analysis.

    Chapter 3. Export and import project configuration

    You can export and import the configuration data in a project by doing an export or import from the management console. You can use export and import as part of your backup and restore procedures.

    Types of export and import

    You can export and import the queries in a project. You can also export and import any combination of the following analysis rules:

    • sentiment terms

    • themes and concepts

    • media sets

    You can also perform export and import from the Welcome page. For more information, see the IBM Social Media Analytics User Guide.

    Exporting a project configuration

    You can export the configuration data for a project by using the management console from the command line.

    Before you begin

    You must know the value of the internal project ID for the project to be exported. Find the value by searching the cci_topology.xml file for type="Project" where property name="contextRoot" and value=project ID as displayed on the Welcome page. The internal project ID is the value where property name="projectID". In the following example excerpt from the cci_topology.xml file, the project ID as displayed on the Welcome page is myproject and the internal project ID is pro00001:
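    The lookup that is described above can also be sketched in Python. This is only an illustration: the element names and nesting in the sample string are assumptions (the text names only the type, name, and value attributes and the contextRoot and projectID properties), so the real cci_topology.xml file might be laid out differently.

```python
import xml.etree.ElementTree as ET


def find_internal_project_id(topology_xml, welcome_page_id):
    """Return the projectID of the type="Project" entry whose contextRoot
    matches the project ID shown on the Welcome page, or None."""
    root = ET.fromstring(topology_xml)
    for element in root.iter():
        if element.get("type") != "Project":
            continue
        # Collect name/value pairs from the nested property elements.
        props = {p.get("name"): p.get("value") for p in element.iter("property")}
        if props.get("contextRoot") == welcome_page_id:
            return props.get("projectID")
    return None


# Hypothetical sample; not an excerpt from a real topology file.
SAMPLE = """
<topology>
  <resource id="example-project" type="Project">
    <property name="contextRoot" value="myproject"/>
    <property name="projectID" value="pro00001"/>
  </resource>
</topology>
"""

print(find_internal_project_id(SAMPLE, "myproject"))
```

    For the hypothetical sample, the function returns pro00001 for the Welcome page ID myproject and None for an ID that is not present.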

    About this task

    You can export one project configuration at a time.

    Ensure that users are not working in Configuration when you perform this task.

    The exported project configuration is stored in a JavaScript Object Notation (.json) file in a folder that is named with the project ID. The default location of the folder is stored in the cci_topology.xml file. You can change the default location by updating the cci_topology.xml file. Look for the value where resource id is project-config-ui-1 and property name is configFileDir for the user interface node. For more information about the topology file, see the IBM Social Media Analytics Installation and Configuration Guide.

    Log files for the export process are stored in install_location/cci_tmgmt/BackupRestoreUtility/logs.

    Procedure

    1. Log in to the user interface node as the cciusr user.

    2. Type the following command:

    ./cci_cli.sh -u status=complete -r post:/projects/internal_project_ID/exportConfiguration:{"file":"file_name","datafetcher":true|false,"sentiments":true|false,"typesconcepts":true|false,"mediasets":true|false}

    Where internal_project_ID is the internal project ID that you determined in the Before you begin section and file_name is the name of the file to export to. You can also specify an absolute or relative path with the file name. If you specify a relative path, the path is relative to the path that is stored in the cci_topology.xml file. Do not insert blanks in the text that occurs between the braces ({}). In the following example, the queries, themes, and concepts for the project with internal project ID pro00001 are exported to /home/cciusr/Project1_backup.json.

    ./cci_cli.sh -u status=complete -r post:/projects/pro00001/exportConfiguration:{"file":"/home/cciusr/Project1_backup","datafetcher":true,"sentiments":false,"typesconcepts":true,"mediasets":false}
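    Because blanks are not allowed between the braces, it can help to assemble the command text programmatically. The following Python sketch builds the command string in the form shown above; the function name build_config_command is hypothetical, and the sketch only produces the text of the command without running it.

```python
import json


def build_config_command(action, internal_project_id, file_name, *,
                         datafetcher, sentiments, typesconcepts, mediasets):
    """Assemble the cci_cli.sh command text for an export or import."""
    if action not in ("exportConfiguration", "importConfiguration"):
        raise ValueError("unknown action: " + action)
    options = {
        "file": file_name,
        "datafetcher": bool(datafetcher),
        "sentiments": bool(sentiments),
        "typesconcepts": bool(typesconcepts),
        "mediasets": bool(mediasets),
    }
    # Compact separators keep blanks out of the text between the braces.
    payload = json.dumps(options, separators=(",", ":"))
    if any(ch.isspace() for ch in payload):
        raise ValueError("blanks are not allowed between the braces")
    return ("./cci_cli.sh -u status=complete -r post:/projects/"
            + internal_project_id + "/" + action + ":" + payload)


print(build_config_command(
    "exportConfiguration", "pro00001", "/home/cciusr/Project1_backup",
    datafetcher=True, sentiments=False, typesconcepts=True, mediasets=False))
```

    With these arguments, the sketch reproduces the example command above. A file name that contains a blank is rejected with a ValueError.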

    Importing a project configuration

    You can import the configuration data for a project by using the management console from the command line.

    Before you begin

    The project that you want to import into must exist in IBM Social Media Analytics.

    You must know the value of the internal project ID for the project that you want to import into. Find the value by searching the cci_topology.xml file for type="Project" where property name="contextRoot" and value=project ID as displayed on the Welcome page. The internal project ID is the value where property name="projectID". In the following example excerpt from the cci_topology.xml file, the project ID as displayed on the Welcome page is myproject and the internal project ID is pro00001:

    About this task

    You can import one project configuration at a time.

    Ensure that users are not working in Configuration when you perform this task.

    Log files for the import process are stored in the install_location/cci_tmgmt/BackupRestoreUtility/logs directory.

    Procedure

    1. Log in to the user interface node as the cciusr user.

    2. Type the following command:

    ./cci_cli.sh -u status=complete -r post:/projects/internal_project_ID/importConfiguration:{"file":"file_name","datafetcher":true|false,"sentiments":true|false,"typesconcepts":true|false,"mediasets":true|false}

    Where internal_project_ID is the internal project ID that you determined in the Before you begin section and file_name is the name of the file to import from. You can also specify an absolute or relative path with the file name. If you specify a relative path, the path is relative to the path that is stored in the cci_topology.xml file. Do not insert blanks in the text that occurs between the braces ({}). In the following example, the queries, themes, and concepts from /home/cciusr/Project1_backup.json are imported into the project with internal project ID pro00001.

    ./cci_cli.sh -u status=complete -r post:/projects/pro00001/importConfiguration:{"file":"/home/cciusr/Project1_backup","datafetcher":true,"sentiments":false,"typesconcepts":true,"mediasets":false}
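    Before you run an import, you might want to confirm that the backup file is present and still parses as JSON. The following Python sketch is an optional sanity check, not part of the product; the function name is hypothetical, and the check does not validate the contents of the configuration.

```python
import json
from pathlib import Path


def is_readable_json(path):
    """Return True if the path is an existing file whose contents parse as JSON."""
    file_path = Path(path)
    if not file_path.is_file():
        return False
    try:
        with file_path.open(encoding="utf-8") as handle:
            json.load(handle)
    except (OSError, ValueError):
        # ValueError covers both JSON syntax errors and text decoding errors.
        return False
    return True


if __name__ == "__main__":
    # Path taken from the example above; adjust it for your system.
    print(is_readable_json("/home/cciusr/Project1_backup.json"))
```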

    Chapter 4. Influencer reports configuration

    The influencer reports contain data that is provided by a third-party influence score provider. Before you can see influencer data in the influencer reports, you must update the configuration file with information about your influence score providers.

    The influencer reports provide an analysis of authors that is based on influence score or category. This information is derived by analyzing the influencer score provided by an influence score provider.

    If you have a GNIP PowerTrack-enabled BoardReader license key, IBM Social Media Analytics automatically populates the following scores for Twitter authors:

    • The Klout score of the author.

    • The number of Twitter followers the author has.

    • The number of Twitter friends the author has.

    To have more influencer score data display in the influencer reports, you must do the following actions:

    • Obtain a license from one or more influence score providers.

    • Configure the influence_score_vendor_configuration.json file. For information about how to configure the influence_score_vendor_configuration.json file, see "Configuring the influencer configuration file."

    After you configure and enable the influence_score_vendor_configuration.json file, every ad hoc or scheduled job in your project references it. Processing of the influence scores occurs during the export phase of a job. The dataloader.log file contains information about, and errors that are related to, the processing of the influence scores. After a job runs, the influencer data displays in the influencer reports in Reporting.

    If you want to change the influence_score_vendor_configuration.json file after you have run jobs, follow the procedure in "Updating the influencer configuration file" on page 15.

    Configuring the influencer configuration file

    The influence_score_vendor_configuration.json file contains parameters that describe the influence score provider API, values to retrieve, retrieval limitations, and refresh and purge requirements. Perform this task the first time that you configure the influencer configuration file.

    Before you begin

    You must know the license key for your influence score provider.

    About this task

    You can configure the following influence score providers:

    Klout
        Klout supports one type of score, called score.

        Note: If you have a GNIP PowerTrack-enabled BoardReader license key, IBM Social Media Analytics populates the Klout score. Do not configure Klout separately in this case.

    BoardReader
        If you have a GNIP PowerTrack-enabled BoardReader license key, you can get Klout scores through the BoardReader API.

        Note: You can configure the file to get Klout scores directly from Klout or through BoardReader, but not both.

    Other third-party provider
        An influence score provider other than Klout.

    The influence_score_vendor_configuration.json file and the influence_score_vendor_configuration.json.template file are stored on the data node. The location of these files depends on the path that is chosen when you install the product. The default location is /local/cci/prod/dls/services/Dataloader/conf.

    The influence_score_vendor_configuration.json file contains default entries for Klout. The influence_score_vendor_configuration.json.template file contains sample entries for BoardReader and other third-party influence score providers. You can copy entries from the influence_score_vendor_configuration.json.template file, paste them into the influence_score_vendor_configuration.json file, and modify them as required.

    If you are working on a Linux operating system, you can use the vi editor or another text editor to edit the file. If you are working on a Microsoft Windows operating system, ensure that there are no CTRL-M characters in the file before you copy it back to your Linux operating system.
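    CTRL-M is the carriage return character that Windows editors add at the end of each line. As a minimal sketch, the following Python functions count and remove these characters in the raw bytes of a file; the function names are hypothetical, and reading and writing the actual file is left to you.

```python
def count_carriage_returns(data):
    """Count CTRL-M (carriage return) bytes in the raw file contents."""
    return data.count(b"\r")


def strip_carriage_returns(data):
    """Remove every CTRL-M byte so that only Linux line endings remain."""
    return data.replace(b"\r", b"")


# Sample bytes that imitate a file saved with Windows line endings.
windows_text = b'{\r\n  "api_id": "example"\r\n}\r\n'
print(count_carriage_returns(windows_text))
print(strip_carriage_returns(windows_text))
```

    A count of zero means that the file contains no CTRL-M characters.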

    After you complete this task, every ad hoc or scheduled job in your project will reference the influence_score_vendor_configuration.json file. If you want to update the file, see "Updating the influencer configuration file" on page 15.

    The following list explains the parameters in the configuration file. For Klout, you must configure author_id_url and influence_url. Consider using the default values initially and then modifying them after you run a job and see the results in the influencer reports:

    api_id
        Identifies the API of the influence score provider.

    document_source, site_url, media_set
        Influence score provider APIs provide scores for authors for specific document sources (for example, Twitter), social media sites (for example, twitter.com), or media sets (sets of sites). Use the document_source, site_url, and media_set parameters to specify the content sources that are applicable for this API. You must specify at least one of these parameters. Social Media Analytics calls the influence score provider API with authors from only content that matches the parameters.

    author_nickname
        The column name in the IBM DB2 database Author dimension that is used as the author name for the influence score provider API. Do not change this parameter.

    author_id_url

    12 IBM Social Media Analytics Version 1.3.0: Administration Guide

  • 5/21/2018 ag_ci

    19/56

    The web address that provides the value for the ID of an author,AUTHOR_ID. The web address must include authentication tokens if they arerequired. The web address can contain a reference to author_nickname,which Social Media Analytics substitutes for the actual value. Replace@api_key@with your API license key.

    This parameter is required for Klout. Klout has a two-step process for providing influencer scores. The first step retrieves the identity of the author by using this parameter. The second step uses the influence_url to get the influence score for the author. For API providers that do not support this two-step process, exclude this parameter from the file.

    influence_url

    The web address of the influence score provider that provides the influencer data. The web address must include authentication tokens if they are required. The web address can contain a reference to author_nickname, which Social Media Analytics replaces with the actual value. Replace @api_key@ with your license key.

    maximum_requests

    The maximum number of authors that the API allows in a single retrieval request. If the default value of an empty string is used, scores for all authors that match the filter parameters are retrieved.

    purge_interval

    Number of days after which the score is deleted for an author. Some influence score providers limit the number of days that a score can be kept. If your influence score provider limits the number of days that a score can be kept, enter the value here. This value must be greater than or equal to 5 so that Social Media Analytics can download scores for a larger set of authors over time. This value must be greater than the value of the Interval field in a scheduled job. If this value is less than the value of Interval, the scores are not deleted.

    refresh_interval

    Number of days after which the score is refreshed for an author. The refresh date determines which authors to update when the job runs. If no refresh date is specified, authors are deleted based on the value of purge_interval.

    score_mappings

    For each score that you want to receive, configure one score_mappings section. An influencer report contains a list in which you can select one score to be displayed in the report. For each score_mappings section that you configure, there is one entry in the list in the report. You must define at least one score_mappings section for each of your influence score providers. You can define a maximum of 7 continuous scores and 14 categorical scores.

    score_parameter

    The parameter name in the JSON response that holds the score. The value can be a JSON path in the case where the score is nested in the response. For example, in the following JSON response, the JSON paths are data.influence and data.outreach:

    "data": [
      {
        "influence": 773,
        "name": "PeopleBrowsr",
        "outreach": 7
      }
    ]

    score_type

    Valid values are continuous and categorical. A continuous score can be used to show authors with the highest X scores. Categorical scores can be put into discrete categories or bins to show influence score distribution across all authors.

    display_name

    Name of the influence score that is displayed in the reports. The name must be unique in this file.

    binning

    This parameter specifies the binning strategy to be applied to a continuous score. Set this parameter to none if you want to use the continuous score as is. If you want to convert a continuous score into a categorical score, set this parameter to fixedWidth.

    binning_valueRange

    If binning = fixedWidth, this parameter specifies the value range for the result parameter. For example, Klout scores have a range of 0 - 100. Therefore, set this parameter to [0 - 100].

    binning_numberOfBins

    If binning = fixedWidth, this parameter specifies the number of bins to use. For example, if the value of binning_valueRange is [0 - 100], and you set the value of binning_numberOfBins to 5, then there are five bins with a bin size of 20.
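
    To make the score_parameter and fixedWidth descriptions concrete, the following Python sketch resolves a dotted path such as data.influence in a parsed response and assigns a score to a fixed-width bin. It is a sketch under stated assumptions, not the product's implementation: the function names are hypothetical, the rule of descending into the first element of a list is an assumption, and so is the choice to place the top of the range in the last bin.

```python
from typing import Any


def resolve_json_path(response: dict, path: str) -> Any:
    """Follow a dot-separated path such as "data.influence" through parsed JSON.

    Assumption: when a path segment lands on a list, the first element is used.
    """
    node: Any = response
    for key in path.split("."):
        if isinstance(node, list):
            node = node[0]
        node = node[key]
    return node


def fixed_width_bin(score: float, low: float, high: float, number_of_bins: int) -> int:
    """Return the zero-based index of the fixed-width bin that contains score."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside [{low}, {high}]")
    width = (high - low) / number_of_bins
    # Assumption: the top of the range belongs to the last bin, not a bin of its own.
    return min(int((score - low) // width), number_of_bins - 1)


if __name__ == "__main__":
    response = {"data": [{"influence": 773, "name": "PeopleBrowsr", "outreach": 7}]}
    print(resolve_json_path(response, "data.influence"))
    print(resolve_json_path(response, "data.outreach"))
    # A range of [0 - 100] with 5 bins gives a bin size of 20.
    print(fixed_width_bin(45, 0, 100, 5))
```

    With a value range of [0 - 100] and five bins, a score of 45 falls in the third bin (index 2), which covers 40 up to 60.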

    Procedure

    1. Open the influence_score_vendor_configuration.json file in a text editor.

    2. Change the value of influence_score_enabled to true.

    Tip: To temporarily disable influence score data retrieval, set this value to false.

    3. Find the section in the file for your influence score provider. If you do not see it, the influence_score_vendor_configuration.json.template file contains sample entries that you can copy and paste into influence_score_vendor_configuration.json.

    4. Update the required parameters for one or more influence score providers as described in the following steps:

    a. For Klout, update influence_url and author_id_url. For both parameters, replace @api_key@ with your Klout API license key.

    b. For a PowerTrack-enabled BoardReader license key, set both influence_url and author_id_url to an empty string. Set api_id to Klout.

    5. Leave the default values for the other parameters as they are defined or change them for your business requirements.

    6. Save the file.
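
    Before you save the file, it can help to check an entry against the rules in the parameter list. The following Python sketch does that for one provider entry. The layout it assumes (a dictionary per provider with a score_mappings list of dictionaries) is a guess for illustration only, because the actual file structure is defined by the template file; the function name is hypothetical.

```python
def validate_provider(provider: dict) -> list:
    """Return a list of problems found in one influence score provider entry."""
    problems = []

    # At least one content source parameter must be specified.
    if not any(provider.get(key) for key in ("document_source", "site_url", "media_set")):
        problems.append("specify at least one of document_source, site_url, media_set")

    # purge_interval, when set, must be greater than or equal to 5.
    purge = provider.get("purge_interval")
    if purge not in (None, "") and int(purge) < 5:
        problems.append("purge_interval must be greater than or equal to 5")

    mappings = provider.get("score_mappings", [])
    if not mappings:
        problems.append("define at least one score_mappings section")

    # This sketch checks display_name uniqueness within one provider entry only.
    names = [mapping.get("display_name") for mapping in mappings]
    if len(names) != len(set(names)):
        problems.append("display_name values must be unique")

    for mapping in mappings:
        if mapping.get("score_type") not in ("continuous", "categorical"):
            problems.append(f"invalid score_type: {mapping.get('score_type')!r}")

    continuous = sum(1 for m in mappings if m.get("score_type") == "continuous")
    categorical = sum(1 for m in mappings if m.get("score_type") == "categorical")
    if continuous > 7:
        problems.append("a maximum of 7 continuous scores is allowed")
    if categorical > 14:
        problems.append("a maximum of 14 categorical scores is allowed")
    return problems


if __name__ == "__main__":
    entry = {
        "site_url": "twitter.com",
        "purge_interval": 10,
        "score_mappings": [{"display_name": "Klout score", "score_type": "continuous"}],
    }
    print(validate_provider(entry))
```

    An empty list means that none of the checked rules is violated; it does not guarantee that the provider accepts the configuration.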

    Updating the influencer configuration file

    After you have configured and enabled the influence_score_vendor_configuration.json file and run jobs that reference it, you might need to update it. For example, you want to add an influence score provider. In this situation, there are some special considerations to be aware of.

    About this task

    The influence_score_vendor_configuration.json file and the influence_score_vendor_configuration.json.template file are stored on the data node. The location of these files depends on the path that is chosen when you install the product. The default location is /local/cci/prod/dls/services/Dataloader/conf.

    If you are working on a Linux operating system, you can use the vi editor or other text editor to edit the file. If you are working on a Microsoft Windows operating system, ensure that there are no CTRL-M characters in the file before you copy it back to your Linux operating system.

    For detailed information about the parameters in the influencer configuration file, see Configuring the influencer configuration file on page 11.

    Procedure

    1. Edit the influence_score_vendor_configuration.json file.

    2. Ensure that the value of influence_score_enabled is true.

    Tip: To temporarily disable influence score data retrieval, set this value to false.

    3. Make your changes and save the file.

    4. If you made any of the following changes, you must re-create the scheduled jobs that existed before you updated the file:

    - Added or deleted an influence score provider.

    - Added, deleted, or changed a score_mappings section.

    Chapter 5. Social Media Analytics services and web applications

    You can manage various services and web applications that run on the IBM Social Media Analytics servers.

    The Social Media Analytics architecture uses several server nodes; the server nodes do not necessarily map to physical computers. Each node has a special purpose and runs its own services and applications.

    The following table shows the Social Media Analytics services on each node.

    Table 1. Social Media Analytics services and their resource IDs

    Server node              Service                         Resource ID
    User interface node      WebSphere Application Server    was-ui-1
    User interface node      Apache Web Server               apacheweb-ui-1
    Data node                WebSphere Application Server    was-ds-1
    Data node                DB2                             db2-ds-1
    Data node                Data Loader Service             dataloader-ds-1
    Data node                Open LDAP                       ldap-ds-1
    Document limiter node    WebSphere Application Server    was-dl-1
    Document limiter node    DB2                             db2-dl-1
    Hadoop master node       IBM Hadoop                      hadoop-hm-1
    Hadoop master node       Flow Manager                    flowmanager-hm-1

    The following table shows the Social Media Analytics web applications on each node.

    Table 2. Social Media Analytics web applications and their resource IDs

    Server node              Web application                                        Resource ID
    User interface node      Cognos BI web application                              cognos-ui-1
    User interface node      Social Media Analytics REST API web application        sma-rest-api-ui-1
    User interface node      Social Media Analytics UI web application              sma-application-ui-1
    User interface node      Social Media Analytics Management web application      ssma-mgmt-ui-1
    User interface node      Social Media Analytics Welcome Page web application    landing-page-ui-1
    User interface node      Administration UI web application                      adminapp-was-ui-1 (one instance per project)
    User interface node      Analysis UI web application                            analysisui-was-ui-1 (one instance per project)
    Data node                Analysis DAS web application                           analysisdas-was-ds-1-pro00001 (one instance per project)
    Document limiter node    Document limiter web application                       dlapp-dl-1

    Starting Social Media Analytics services and web applications

    You can start Social Media Analytics services and web applications individually or all together.

    Before you begin

    Before you perform this task, do the following actions:

    - Log on to the user interface node as the cciusr user.

    - Go to the sma_location/cci_installmgr/cci_mngmt/cci_cli/ directory, and check the status of Social Media Analytics services and web applications by typing one of the commands from the following table.

    Table 3. Commands for checking the status of Social Media Analytics services and web applications

    Action                                                  Command
    Show the status of an individual service.               ./cci_cli.sh -process status resourceId resource_id
    Show the status of an individual web application.       ./cci_cli.sh -process statusWebApplications resourceId resource_id
    Show the status of all services.                        ./cci_cli.sh -process status
    Show the status of all web applications.                ./cci_cli.sh -process statusWebApplications
    Show the status of all services and web applications.   ./cci_cli.sh -process statusAll

    In the commands for an individual service or web application, replace resource_id with the resource ID of the service or web application whose status you want to see.

    Services are listed with a status of RUNNING, STOPPED, UNKNOWN, or UNAVAILABLE. Services with a RUNNING status are unaffected by start commands. Services with a STOPPED, UNKNOWN, or UNAVAILABLE status might not start if they are dependent upon another service that is not running. Starting a web application service also starts its associated web applications.

    Procedure

    Type one of the commands from the following table.

    Table 4. Commands for starting Social Media Analytics services and web applications

    Action                                      Command
    Start an individual service.                ./cci_cli.sh -process start resourceId resource_id
    Start an individual web application.        ./cci_cli.sh -process startWebApplications resourceId resource_id
    Start all services.                         ./cci_cli.sh -process start
    Start all web applications.                 ./cci_cli.sh -process startWebApplications
    Start all services and web applications.    ./cci_cli.sh -process startAll

    In the commands for an individual service or web application, replace resource_id with the resource ID of the service or web application to be started.
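
    The status, start, restart, and stop commands in this chapter all follow the same pattern: an action name, an optional WebApplications or All suffix, and an optional resourceId argument. The following Python sketch assembles the argument list from those parts so that a script can build the commands consistently. The function name and the scope labels are hypothetical; only the command strings themselves come from the tables in this chapter.

```python
ACTIONS = ("status", "start", "restart", "stop")
# Scope label (hypothetical) mapped to the suffix that the command tables use.
SCOPES = {"services": "", "webapps": "WebApplications", "all": "All"}


def build_cci_command(action: str, scope: str = "services", resource_id: str = "") -> list:
    """Build the cci_cli.sh argument list for one of the documented process commands."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope}")
    if scope == "all" and resource_id:
        raise ValueError("a resource ID applies only to a single service or web application")
    command = ["./cci_cli.sh", "-process", action + SCOPES[scope]]
    if resource_id:
        command += ["resourceId", resource_id]
    return command


if __name__ == "__main__":
    print(" ".join(build_cci_command("start", resource_id="was-ui-1")))
    print(" ".join(build_cci_command("restart", "webapps", "cognos-ui-1")))
    print(" ".join(build_cci_command("status", "all")))
```

    The sketch only builds the command. Run the result from the cci_cli directory on the user interface node as the cciusr user, as described in the procedures.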

    Related concepts:

    Chapter 5, Social Media Analytics services and web applications, on page 17
    You can manage various services and web applications that run on the IBM Social Media Analytics servers.

    Restarting Social Media Analytics services and web applications

    You can restart Social Media Analytics services and web applications individually or all together.

    Before you begin

    Before you perform this task, do the following actions:

    - Log on to the user interface node as the cciusr user.

    - Go to the sma_location/cci_installmgr/cci_mngmt/cci_cli/ directory, and check the status of Social Media Analytics services and web applications by typing one of the commands from the following table.

    Table 5. Commands for checking the status of Social Media Analytics services and web applications

    Action                                                  Command
    Show the status of an individual service.               ./cci_cli.sh -process status resourceId resource_id
    Show the status of an individual web application.       ./cci_cli.sh -process statusWebApplications resourceId resource_id
    Show the status of all services.                        ./cci_cli.sh -process status
    Show the status of all web applications.                ./cci_cli.sh -process statusWebApplications
    Show the status of all services and web applications.   ./cci_cli.sh -process statusAll

    In the commands for an individual service or web application, replace resource_id with the resource ID of the service or web application whose status you want to see.

    Services are listed with a status of RUNNING, STOPPED, UNKNOWN, or UNAVAILABLE. Services with a STOPPED, UNKNOWN, or UNAVAILABLE status might not restart if they are dependent upon another service that is not running. Restarting a web application service also restarts its associated web applications. Restarting services or web applications might cause in-progress jobs to fail.

    Procedure

    Type one of the commands from the following table.

    Table 6. Commands for restarting Social Media Analytics services and web applications

    Action                                                 Command
    Stop and restart an individual service.                ./cci_cli.sh -process restart resourceId resource_id
    Stop and restart an individual web application.        ./cci_cli.sh -process restartWebApplications resourceId resource_id
    Stop and restart all services.                         ./cci_cli.sh -process restart
    Stop and restart all web applications.                 ./cci_cli.sh -process restartWebApplications
    Stop and restart all services and web applications.    ./cci_cli.sh -process restartAll

    In the commands for an individual service or web application, replace resource_id with the resource ID of the service or web application to be restarted.

    Related concepts:

    Chapter 5, Social Media Analytics services and web applications, on page 17
    You can manage various services and web applications that run on the IBM Social Media Analytics servers.

    Stopping Social Media Analytics services and web applications

    You can stop Social Media Analytics services and web applications individually or all together.

    Before you begin

    Before you perform this task, do the following actions:

    - Log on to the user interface node as the cciusr user.

    - Go to the sma_location/cci_installmgr/cci_mngmt/cci_cli/ directory, and check the status of Social Media Analytics services and web applications by typing one of the commands from the following table.

    Table 7. Commands for checking the status of Social Media Analytics services and web applications

    Action                                                  Command
    Show the status of an individual service.               ./cci_cli.sh -process status resourceId resource_id
    Show the status of an individual web application.       ./cci_cli.sh -process statusWebApplications resourceId resource_id
    Show the status of all services.                        ./cci_cli.sh -process status
    Show the status of all web applications.                ./cci_cli.sh -process statusWebApplications
    Show the status of all services and web applications.   ./cci_cli.sh -process statusAll

    In the commands for an individual service or web application, replace resource_id with the resource ID of the service or web application whose status you want to see.

    Services are listed with a status of RUNNING, STOPPED, UNKNOWN, or UNAVAILABLE. Services with a STOPPED status are unaffected by stop commands. Stopping a web application service also stops its associated web applications. Stopping services or web applications might cause in-progress jobs to fail.

    Procedure

    Type one of the commands from the following table.

    Table 8. Commands for stopping Social Media Analytics services and web applications

    Action                                     Command
    Stop an individual service.                ./cci_cli.sh -process stop resourceId resource_id
    Stop an individual web application.        ./cci_cli.sh -process stopWebApplications resourceId resource_id
    Stop all services.                         ./cci_cli.sh -process stop
    Stop all web applications.                 ./cci_cli.sh -process stopWebApplications
    Stop all services and web applications.    ./cci_cli.sh -process stopAll

    In the commands for an individual service or web application, replace resource_id with the resource ID of the service or web application to be stopped.

    Related concepts:

    Chapter 5, Social Media Analytics services and web applications, on page 17
    You can manage various services and web applications that run on the IBM Social Media Analytics servers.

    Deleting analysis data for specific authors or web addresses

    You can delete content for specific authors or web addresses from the analyzed data that is stored in IBM Social Media Analytics databases.

    Before you begin

    Before you perform this task, ensure that no jobs are running in the export phase. The export phase of a job loads analysis data into the databases, which conflicts with the deletion that occurs in this task.

    About this task

    On the data node, there is a JavaScript Object Notation (.json) file called blacklistInfo.json.template that is located at /local/cci/prod/dls/services/Dataloader/conf. Use this file as a template to create a list of authors and web addresses to be deleted. The following example illustrates how to specify authors and web addresses in the .json file:

    {
      "blacklist": {
        "Url" : [ "http://web_address1.com", "http://web_address2.com" ],
        "Author" : [ "http://web_address3.com/author", "http://web_address4.com/author" ]
      }
    }

    When you specify a web address, all data that is associated with the domain and subdomain of the web address is deleted. When you specify an author, all data that is related to the author is deleted. There is a database for Reporting and a database for Analysis. The data is removed for all projects in these databases. After the data is deleted, it does not appear in Reporting or Analysis.

    After you complete this task, you can add the command to a crontab job. Adding the command to a crontab job ensures that the deletion occurs automatically and regularly for any new data that is added to the databases. Ensure that the command runs only when no jobs are running in the export phase.
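
    If the list of web addresses and authors is maintained elsewhere, the file can be generated instead of edited by hand. The following Python sketch builds text in the same shape as the example above. The function name is hypothetical, and removing duplicates and sorting the entries are conveniences of the sketch, not requirements that are stated in this guide.

```python
import json


def build_blacklist(urls, authors) -> str:
    """Return JSON text in the same shape as the blacklistInfo.json.template example."""
    document = {
        "blacklist": {
            "Url": sorted(set(urls)),
            "Author": sorted(set(authors)),
        }
    }
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    text = build_blacklist(
        ["http://web_address1.com", "http://web_address2.com"],
        ["http://web_address3.com/author"],
    )
    print(text)
```

    Write the resulting text to a uniquely named file, such as myblacklistInfo.json, and pass the fully qualified path of that file to the blacklist command in the procedure.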

    Procedure

    1. Make a copy of the blacklistInfo.json.template file and save it on the user interface node. Give the file a unique name. For example, myblacklistInfo.json.

    2. Open myblacklistInfo.json in a text editor.

    3. Add web addresses and authors as explained in the About this task section of this task, and save the file.

    4. Log on to the user interface node as the cciusr user.

    5. Go to the sma_location/cci_installmgr/cci_mngmt/cci_cli/ directory.

    6. Type the following command:

    ./cci_cli.sh -process blacklist blacklistFile path_to_myblacklistInfo.json

    Where path_to_myblacklistInfo.json is the fully qualified path and name of the file that contains the web addresses and authors to be deleted.

    7. When the command completes, go to the dataloader node and look at the dataloader.log file. If the deletion is successful, then go to the next step. If the deletion was unsuccessful, the dataloader.log file contains error messages that help you to troubleshoot the problem. Rerun the ./cci_cli.sh command if necessary.

    8. Log out of Social Media Analytics, and clear your browser cache.

    9. Verify that the data is deleted by performing the following steps:

    a. Log on to Social Media Analytics. For a project, go to the Search Results tab in Analysis, and verify that the web addresses and authors that you deleted do not appear in the list of snippets.

    b. For a project, go to Reporting, and view the top influencers page. Verify that the web addresses and authors that you deleted do not appear in any of the reports.

    Regenerating the public and private API keys

    The set of public and private keys that is used to sign the IBM Social Media Analytics APIs is generated and published as part of the installation. If necessary, you can regenerate and republish a new set of keys.

    About this task

    This process generates and publishes a set of public and private keys and refreshes them on all server nodes that require them. The keys are stored in the installation location in a folder named apikeys. The default installation location is /local/ibm/cognos/ci/coninsight/.

    Procedure

    1. Log on to the user interface node as the cciusr user.

    2. Type the following command:

    cci_cli.sh -process publishApiKeys

    Viewing the limit for downloadable documents from BoardReader

    You can see the maximum number of documents that can be retrieved from BoardReader during a calendar month.

    About this task

    You can see the following fields:

    Limit

    The maximum number of documents that can be retrieved from BoardReader during a calendar month. The default value is 12 million.

    Enforce Limits

    A boolean flag that indicates whether a user can start a job that will exceed the document limit. If this parameter is set to true, IBM Social Media Analytics prevents a job from being started if the job will exceed the document limit. The user must change the job options to lower the number of documents it retrieves. If this parameter is set to false, a warning message displays when the job exceeds the document limit. The user can still run the job.

    Warning Threshold

    A value that determines the point at which the Current usage progress bar on the Welcome page changes color to indicate that you are nearing your document limit. This value represents the ratio of retrieved documents to the monthly limit. It is expressed as a value between 0.00 and 1.00.

    The value in the Usage column shows the number of documents that have been retrieved to date for the month for each project. The value in the Reserved Docs column shows the remaining number of documents that are estimated to be retrieved for a currently running job. This value is set by IBM Social Media Analytics at the start of a job and is set to zero when the data fetcher phase of the job finishes. Before a job starts, the total of the values in the Usage column and the value in the Reserved Docs field is calculated. Depending on the value of Enforce Limits, Social Media Analytics prevents the job from running or it displays a warning message and the job can still run.

    At the start of a calendar month, the values in the Usage column are set to zero.

    To change the values of these parameters, see Changing the limit for downloadable documents from BoardReader.
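
    The interaction between Limit, Enforce Limits, Warning Threshold, Usage, and Reserved Docs can be summarized in a few lines of Python. This is a sketch of the behavior as described above, not the document limiter's actual code. The function names are hypothetical, and two details are assumptions: that reserved_docs already includes the estimate for the job that is about to start, and that a total exactly equal to the limit is still allowed.

```python
def job_decision(limit: int, usage_by_project: list, reserved_docs: int,
                 enforce_limits: bool) -> str:
    """Return "allowed", "warning", or "blocked" for a job that is about to start."""
    # Total of the Usage column across projects plus the Reserved Docs value.
    total = sum(usage_by_project) + reserved_docs
    if total <= limit:
        return "allowed"
    # Over the limit: Enforce Limits decides between blocking and warning.
    return "blocked" if enforce_limits else "warning"


def nearing_limit(documents_retrieved: int, limit: int, warning_threshold: float) -> bool:
    """Return True when the ratio of retrieved documents to the limit reaches the threshold."""
    return documents_retrieved / limit >= warning_threshold


if __name__ == "__main__":
    print(job_decision(1000, [400, 500], 200, enforce_limits=True))
    print(job_decision(1000, [400, 500], 200, enforce_limits=False))
    print(nearing_limit(950, 1000, 0.90))
```

    With a limit of 1000, usage of 400 and 500 across two projects, and 200 reserved documents, the total of 1100 exceeds the limit, so the job is blocked when Enforce Limits is true and only produces a warning when it is false.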

    Procedure

    1. Log on to Social Media Analytics as a user with administrator or system administrator privileges.

    2. Enter the following web address in the address bar of the web browser:

    http://ui_node_host_name/documentlimiter/admin.jsp

    Changing the limit for downloadable documents from BoardReader

    Your license with IBM limits the number of documents that you can retrieve from BoardReader each month. The default limit is 12 million documents. If your license specifies a different number, you must change this limit by using commands in the management console.

    Procedure

    1. Log on to the user interface node server as the cciusr user.

    2. Go to the /cci_installmgr/cci_mngmt/cci_cli directory.

    3. Run the following command:

    ./cci_cli.sh -process updateDocLimitDeployment monthlyLimit warningThreshold enforceLimits

    For example:

    ./cci_cli.sh -process updateDocLimitDeployment monthlyLimit 1000 warningThreshold 0.90 enforceLimits true

    To change the values, you can specify one or more of the following parameters in the command. If you run the command with no parameters specified, the current values are displayed:

    monthlyLimit

    The maximum number of documents that can be retrieved from BoardReader during a calendar month. The default value is 12 million.

    warningThreshold

    A value that determines the point at which the Current usage progress bar on the Welcome page changes color to indicate that you are nearing your document limit. This value represents the ratio of retrieved documents to the monthly limit. It is expressed as a value between 0.00 and 1.00.

    enforceLimits

    A boolean flag that indicates whether a user can start a job that will exceed the document limit. If this parameter is set to true, IBM Social Media Analytics prevents a job from being started if the job will exceed the document limit. The user must change the job options to lower the number of documents it retrieves. If this parameter is set to false, a warning message displays when the job exceeds the document limit. The user can still run the job.

    Chapter 6. Disk space management

    IBM Social Media Analytics stores interim analysis results. Over time, this data accumulates and fills the disk. Clean up the disk space on the servers periodically.

    You can check the following areas for disk space usage and free up space:

    v Shared disk space that is mounted on the Hadoop master and slave nodes.

    v Local disk space on the data node.

    Checking available disk space

    You can see how much disk space is used on the Hadoop master and slave nodes.

    Procedure

    1. Log in to the Hadoop master node as the hadoop user.

    2. Find the location of the shared file system by looking in the $FLWMGR_HOME/toroBackend/flowmanager/scripts/FlowMgr.properties file. The value of the CLUSTER_FILESYSTEM_PATH entry is the location of the shared file system.

    3. Type the following command to check the disk space usage:

    df -h

    4. In the output from the command, look for the line that contains the value of the CLUSTER_FILESYSTEM_PATH entry from the FlowMgr.properties file. This line shows the percentage of the mounted shared file system that is used.

    In the following example output, the line that contains /mnt/hdgpfs shows the percentage of the mounted shared file system that is used:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda3        79G  9.3G   66G  13% /
    udev             16G  192K   16G   1% /dev
    /dev/sda1       190M   29M  152M  17% /boot
    /dev/sda4       292G  1.9G  275G   1% /local
    /dev/sdb1       998G  828G  170G  83% /mnt/hdgpfs

    If the percentage used on the Hadoop master node is equal to or greater than 80%, free up disk space on it. For more information, see Freeing up space on the Hadoop shared disk on page 28.

    5. Log in to the data node as the cciusr user.

    6. Type the following command to check the disk space usage:

    df -h

    7. In the output from the command, look for the line that contains the percentage of the local disk file system that is used.

    In the following example output, the line that contains /local shows the percentage of the local disk file system that is used:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda3        79G  5.5G   70G   8% /
    udev             16G  164K   16G   1% /dev
    /dev/sda1       190M   30M  152M  17% /boot
    /dev/sda4       1.6T  862G  599G  60% /local

    If the percentage used on the data node is equal to or greater than 80%, free up disk space on it. For more information, see Freeing up disk space on the data node server.
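
    The 80% check can also be scripted so that it runs regularly. The following Python sketch uses the standard library shutil.disk_usage function to compute the percentage used for a mount point. The function names are hypothetical, and the mount points in the example are the ones from the sample output above, which might differ on your system. The result can differ slightly from the Use% column of df, which excludes blocks that are reserved for the root user.

```python
import shutil


def percent_used(path: str) -> float:
    """Return the percentage of the file system containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def needs_cleanup(path: str, threshold: float = 80.0) -> bool:
    """Return True when usage is equal to or greater than the threshold (80% by default)."""
    return percent_used(path) >= threshold


if __name__ == "__main__":
    # Mount points taken from the example df output; adjust them for your system.
    for mount_point in ("/local", "/mnt/hdgpfs"):
        try:
            print(f"{mount_point}: {percent_used(mount_point):.0f}% used, "
                  f"cleanup needed: {needs_cleanup(mount_point)}")
        except OSError:
            print(f"{mount_point}: not available on this system")
```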

    Freeing up space on the Hadoop shared disk

    You can free up disk space on the Hadoop master and slave nodes.

    Procedure

    1. Log in to the Hadoop master node as the hadoop user.

    2. Find the location of the shared file system by looking in the $FLWMGR_HOME/toroBackend/flowmanager/scripts/FlowMgr.properties file. The value of the CLUSTER_FILESYSTEM_PATH entry is the location of the shared file system.

    3. Ensure that there are no jobs that are running.

    4. Type the following commands to remove failed job information:

    cd /location_of_the_shared_file_system/cluster/prod/permanentDirs/failedJobs
    rm -rf failedJob*

    Where location_of_the_shared_file_system is the value of CLUSTER_FILESYSTEM_PATH that you found in step 2. For example,

    cd /mnt/hdgpfs/cluster/prod/permanentDirs/failedJobs
    rm -rf failedJob*

    5. Type the following commands to remove temporary evolving topics data:

    cd /location_of_the_shared_file_system/cluster/prod/permanentDirs/topicEvTemp
    rm -rf *

    Where location_of_the_shared_file_system is the value of CLUSTER_FILESYSTEM_PATH that you found in step 2. For example,

    cd /mnt/hdgpfs/cluster/prod/permanentDirs/topicEvTemp
    rm -rf *

    Freeing up disk space on the data node server

    You can free up disk space on the data node server by performing a cleanup operation on a number of different directories.

    Procedure

    1. On the data node server, log on as the cciusr user.

    2. Change to the staging directory by typing the following command:

    cd /staging

    3. Perform the following actions for each project.

    a. Go to the adhoc directory by typing the following command:

    cd /adhoc

    b. Delete all directories, except for the most recent directory, by typing the following command:

    rm -rf $(ls -t | tail -n +2)

    c. Go to the scheduled directory by typing the following command:

    cd ../scheduled

    d. Delete all directories, except for the most recent directory, by typing the following command:

    rm -rf $(ls -t | tail -n +2)

    4. You can clean up log files from the service directory.

    a. To clean up the dataloader log files, go to the dataloader directory by typing the following command:

    cd /Dataloader

    b. Type the following command to remove all log files older than ten days:

    find . -name "dataloader.log*" -mtime +10 -exec rm {} \;
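
    Because the rm and find commands in this procedure delete data immediately, it can be useful to preview what they would remove. The following Python sketch lists the entries that would be selected, without deleting anything. The function names are hypothetical; the selection logic mirrors ls -t | tail -n +2 (everything except the most recently modified entry) and find -mtime +10 (files whose age in whole days is greater than ten).

```python
import time
from pathlib import Path


def all_but_newest(directory: str) -> list:
    """Entries that `ls -t | tail -n +2` would select: everything except the newest one."""
    # Like ls, skip hidden entries whose names start with a dot.
    entries = [p for p in Path(directory).iterdir() if not p.name.startswith(".")]
    entries.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return entries[1:]


def logs_older_than(directory: str, days: int = 10, pattern: str = "dataloader.log*") -> list:
    """Files that `find . -name PATTERN -mtime +DAYS` would select."""
    now = time.time()
    return [
        p for p in Path(directory).rglob(pattern)
        # find counts age in whole 24-hour periods and +N means more than N.
        if p.is_file() and (now - p.stat().st_mtime) // 86400 > days
    ]


if __name__ == "__main__":
    for path in all_but_newest("."):
        print("would remove:", path)
    for path in logs_older_than("."):
        print("old log file:", path)
```

    Run the sketch from the same directory in which you would type the corresponding command, and compare its output with what you expect before you delete anything.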

    Chapter 7. BoardReader license management

    To be able to retrieve data from BoardReader, IBM Social Media Analytics must be updated with any BoardReader license changes.

    Updating the BoardReader license key

    Update your BoardReader license key information in IBM Social Media Analytics any time the BoardReader license key information changes.

    Before you begin

    You must have the following items:

    - The BoardReader license key that you received from BoardReader as part of your license agreement.

    - Access to a computer whose IP address or domain is registered with BoardReader for your BoardReader license key. This information is in your BoardReader license-related communication.

    Ensure that the IP address or domain of the user interface nodes and the Hadoop master nodes are registered with BoardReader.

    About this task

    When you update the BoardReader license key information in Social Media Analytics according to the following procedure, the license key information is updated in the cci_topology.xml file on the user interface node and the crawler.properties file on the Hadoop master node.

    Procedure

    1. Log on to the user interface node as the cciusr user.

    2. Go to the sma_location/cci_installmgr/cci_mngmt/cci_cli/ directory.

    3. Type the following command:

    ./cci_cli.sh -process changeSetting resourceId boardreader-key-1propertyName boardreader.key propertyValue valid_BR_key

    4. Restart all services by typing the following command:

    ./cci_cli.sh -process restart

    Chapter 8. The Data Fetcher Development Kit

    The Data Fetcher Development Kit allows a software developer to integrate a data source that is not currently available through the standard IBM Social Media Analytics application. A software developer does this by coding, testing, and publishing a custom data fetcher to retrieve data from the data source.

    The Data Fetcher Development Kit consists of the following parts:

    • A specification that defines how to implement the data fetcher so that it can be integrated into Social Media Analytics. The specification includes the parameters that the data fetcher requires. The parameters include the output directory for the search results, the mode that the data fetcher runs in, start and end dates, the queries that the data fetcher will run, and other optional parameters.

    • The format and definition of the results to be generated by the data fetcher. The results must be returned in JavaScript Object Notation (JSON) files in a specific format.

    • Tools to test and validate the data fetcher.

    • Reference implementations that you can use as examples when developing your data fetcher. These implementations follow the specifications defined in the Data Fetcher Development Kit.

    Detailed information about the Data Fetcher Development Kit, including the specifications and reference implementations, is available on the IBM Support Portal (https://www.ibm.com/support/docview.wss?uid=swg27036635). You can find the latest information about the Data Fetcher Development Kit at this site.

    Writing queries for your data fetcher

    If you want to run jobs that analyze data from your data fetcher and the BoardReader search engine at the same time, make sure that your queries work for both the data source of your data fetcher and the BoardReader search engine.

    If you intend to analyze only content from your data fetcher, ensure that your data fetcher returns a unique source and not one of the Social Media Analytics default sources: blogs, discussion forums and message boards, Twitter, news sites, review sites, and video sites. If your data fetcher does return one of the Social Media Analytics default sources, Social Media Analytics will retrieve data for this source from both BoardReader and your data fetcher during an ad hoc or scheduled job.

    For more information about writing queries, see the IBM Social Media Analytics User Guide.

    Using your data fetcher

    Before you can use your data fetcher, you must do the following things:

    1. Publish the data fetcher. This makes it available for use in Social Media Analytics.

    2. Write and test queries for the data source for your data fetcher. If your data fetcher supports an existing source, it is very important to test your queries to ensure that they return the results you expect from both the BoardReader search engine and your data source.

    3. Enter queries for your data fetcher into Social Media Analytics.

    4. Run ad hoc or scheduled jobs for the source supported by your data fetcher.

    If your data fetcher supports a new source, the new source will appear in Configuration and Analysis in the areas where you can select a source. If your data fetcher supports an existing source, you will not see any changes in Configuration or Analysis. When the job runs, if you have selected a source that is supported by your data fetcher, your data fetcher will run in the data fetcher phase and produce results that go into the analysis phase.

    Appendix A. Accessibility features

    Accessibility features help users who have a physical disability, such as restricted mobility or limited vision, to use information technology products successfully.

    Keyboard shortcuts

    IBM Social Media Analytics enables you to use shortcut keys or command keys to navigate through the user interface. You can use predefined combinations of keys to perform specific functions.

    Social Media Analytics uses the standard Microsoft Windows operating system navigation keys. The following table lists the keyboard shortcuts that you can use to navigate in Social Media Analytics.

    Table 9. Keyboard shortcuts

    Navigation item   Description                                              Shortcut key
    General           Perform default action for an active command button      Enter or Spacebar
    General controls  Move forward to the next control at the same level       Tab
    General controls  Move backward to the previous control at the same level  Shift+Tab
    Check boxes       Toggle a check box to select or clear                    Spacebar
    Radio buttons     Toggle a radio button to select or clear                 Spacebar
    Drop-down lists   Open and display the drop-down list contents             Down Arrow
    Drop-down lists   Close a drop-down list                                   Escape
    Tree lists        Expand a node                                            Right Arrow
    Tree lists        Collapse a node                                          Left Arrow
    Scrolling         Scroll down                                              Down Arrow or Page Down
    Scrolling         Scroll up                                                Up Arrow or Page Up

    IBM and accessibility

    See the IBM Accessibility Center (http://w3.ibm.com/able) for more information about the commitment that IBM has to accessibility.

    Appendix B. Troubleshooting and support for IBM Social Media Analytics

    Using troubleshooting resources can help you resolve common problems without assistance. When assistance is required to solve an issue, the troubleshooting resources and checklists can help you collect the information needed to find a solution.

    For a list of known issues, see the Release Notes.

    Troubleshooting checklist for IBM Social Media Analytics

    Troubleshooting is a systematic approach to solving a problem. The goal of troubleshooting is to determine why something does not work as expected and how to resolve the problem.

    Review the following checklist to help you or customer support resolve a problem.

    __ Apply all known fix packs, service levels, or program temporary fixes (PTF).

    A product fix might be available to resolve your problem.

    __ Ensure that the configuration is supported.

    Review IBM Social Media Analytics Supported Software Environments (www.ibm.com/support/docview.wss?uid=swg27040245).

    Review the product requirements for IBM Social Media Analytics (http://pic.dhe.ibm.com/infocenter/prodguid/v1r0/clarity/index.jsp).

    __ Look up error messages by selecting the product from the IBM Support Portal (http://www.ibm.com/support), and then typing the error message code into the Search support box on the right vertical menu bar.

    Error messages give important information to help you identify the component that is causing the problem.

    __ Check the IBM Support Portal (http://www.ibm.com/support) and search for IBM Social Media Analytics.

    __ Reproduce the problem to ensure that it is not just a simple error.

    __ Check the installation directory structure and file permissions.

    The installation location must contain the appropriate file structure and file permissions.

    For example, if the product requires write access to log files, ensure that the directory has the correct permission.

    __ Review all relevant documentation, including release notes, technotes, and proven practices documentation.

    Search the IBM knowledge bases to determine whether your problem is known, has a workaround, or is already resolved and documented.

    __ Review recent changes in your computing environment.

    Sometimes installing new software might cause compatibility issues.

    If the items on the checklist did not guide you to a resolution, you might have to collect diagnostic data. This data is necessary for an IBM technical-support representative to effectively troubleshoot and assist you in resolving the problem. You can also collect diagnostic data and analyze it yourself.

    Troubleshooting resources for IBM Social Media Analytics

    Troubleshooting resources are sources of information that can help you resolve a problem that you are having with a product.

    Support Portal

    The IBM Support Portal is a unified, centralized view of all technical support tools and information for all IBM systems, software, and services.

    The IBM Support Portal lets you access all the IBM support resources from one place. You can tailor the pages to focus on the information and resources that you need for problem prevention and faster problem resolution. Familiarize yourself with the IBM Support Portal by viewing the demo videos (https://www.ibm.com/blogs/SPNA/entry/the_ibm_support_portal_videos).

    Find the content that you need by selecting your products from the IBM Support Portal (http://www.ibm.com/support).

    Searching and navigating for IBM Social Media Analytics

    Access to IBM Social Media Analytics product information can now be configured in the IBM Support Portal, which provides the ability to see all of your links on a single page.

    Information gathering

    Before you contact IBM Support, collect diagnostic data (system information, symptoms, log files, traces, and so on) to help resolve the problem. Gathering this information helps familiarize you with the troubleshooting process and saves you time.

    For more information, see Collecting logging information on page 39.

    Service requests

    Service requests are also known as Problem Management Reports (PMRs). Several methods exist to submit diagnostic information to IBM Software Technical Support.

    To open a PMR or to exchange information with technical support, view the IBM Software Support Exchanging information with Technical Support page (http://www.ibm.com/software/support/exchangeinfo.html). PMRs can also be submitted directly by using the Service requests (PMRs) tool (http://www.ibm.com/support/entry/portal/Open_service_request/Software/Information_Management/Cognos_Business_Intelligence_and_Financial_Performance_Management), or one of the other supported methods that are detailed on the exchanging information page.

    Business Analytics Client Center

    The Business Analytics Client Center on ibm.com provides information, updates, and troubleshooting resources.

    To view troubleshooting information, access the Business Analytics Client Center (http://www.ibm.com/software/analytics/support), and view the information under "Support".

    Fix Central

    Fix Central provides fixes and updates for your system's software, hardware, and operating system.

    Use the pull-down menu to navigate to your product fixes on Fix Central (http://www.ibm.com/support/fixcentral). You might also want to view Getting started with Fix Central (http://www.ibm.com/systems/support/fixes/en/fixcentral/help/getstarted.html).

    Knowledge bases

    You can find solutions to problems by searching IBM knowledge bases.

    You can use the IBM masthead search by typing your search string into the Search field at the top of any ibm.com page.

    IBM Knowledge Center

    IBM Knowledge Center includes documentation for each release. This documentation is also available through product help menus.

    IBM Social Media Analytics version 1.3 documentation is available in IBM Knowledge Center (http://www.ibm.com/support/knowledgecenter/SSJHE9_1.3.0/com.ibm.swg.ba.cognos.sma.doc/welcome.html).

    To find links to the latest known problems and APARs, access the Release Notes in Knowledge Center.

    Collecting logging information

    To help customer support diagnose error messages, run the collectLogs process to capture logging information for all IBM Social Media Analytics components on all nodes to a compressed file. For example, you can use this information to determine logging, version, and topology information.

    Before you begin

    You can choose the level of detail that is logged by Social Media Analytics. Detailed log information is often helpful as you diagnose problems with Web applications or services, or problems with running jobs when auto-generated queries are used at the concept level. However, the finer the level of detail, the greater the performance impact on your application.

    To change the level of detail that is logged, log on to the IBM WebSphere Integrated Solutions Console (http://:9060/ibm/console) as the cciusr user. Go to Troubleshooting > Logs and trace. Select sma1 and choose Change log detail levels. Click the Runtime tab and optionally click the Save runtime changes to configuration as well check box. Expand [All Components] and click com.ibm.sma.* > Message and Trace Levels > fine. Click Apply and then click OK. Under Logging and tracing, click Save.

    Procedure

    1. Log on to the user interface node server as the cciusr user.

    2. Go to the /cci_installmgr/cci_mngmt/cci_cli directory.

    3. Collect logging information by typing the following command:

    ./cci_cli.sh -process collectLogs [historydays number_of_days]

    The historydays parameter is the number of days of logging and diagnostic history you would like to capture. The default value is 15 days.

    4. From the generated output, record the name of the compressed file, the location, and time of creation.

    5. Send the sma_collector_timestamp.zip file to customer support for review.
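If collectLogs has been run several times, more than one archive might be present in the output location. The following is a minimal sketch for picking the most recent one; the function name newest_collector_archive is hypothetical, the directory is whatever location the command output reported, and the sma_collector_*.zip pattern is an assumption based on the file name given in step 5.

```shell
#!/bin/sh
# Print the most recently modified sma_collector_*.zip file in directory $1.
# Prints nothing if no matching file exists.
# (Hypothetical helper; the file name pattern is assumed from step 5.)
newest_collector_archive() {
    ls -t "$1"/sma_collector_*.zip 2>/dev/null | head -n 1
}
```

The printed path can then be checked against the name and creation time recorded in step 4 before the file is sent.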

    Jobs fail with error no disk space is available

    A job fails and the following error log entry occurs in the FlowManager log: no disk space is available.

    IBM Social Media Analytics uses large amounts of disk space to store interim analysis results. Over time, these files accumulate and fill the disk.

    To resolve this issue, free up disk space. For information about how to free up disk space, see Chapter 6, Disk space management, on page 27.

    Minimizing out of memory errors when running multiple projects simultaneously

    Administrators might see out of memory errors when they try to run multiple projects simultaneously.

    If you see these out of memory errors, you can change the io.file.buffer.size parameter in the core-site.xml file on each Hadoop computer.

    Changing the buffer size for read and write operations

    If you see out of memory errors, you can change the size of the buffer for use in sequence files. The size of this buffer determines how much data is buffered during read and write operations.

    Procedure

    1. Log on to the Hadoop master node server as the hadoop user.

    2. Go to the /bin directory and stop the services:

    ./stop.sh hadoop

    3. Go to the /hdm/hadoop-conf-staging directory.

    4. Open the core-site.xml file in an editor.

    a. Decrease the value of the io.file.buffer.size property.

    For example, decrease it to 32768.

    b. Save and then close the file.

    5. Go to the /bin directory and sync the configuration across the Hadoop nodes:

    ./syncconf.sh hadoop

    6. Start the services by typing the following command:

    ./start.sh hadoop
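Before and after editing core-site.xml, it can be useful to confirm the value that is actually set for io.file.buffer.size. The following is a minimal sketch, assuming the usual Hadoop configuration layout in which each property has a name element followed by a value element, each on one line. The function name get_xml_property is hypothetical, and the awk matching is a line-based approximation, not a real XML parser.

```shell
#!/bin/sh
# Print the value of property $2 from the Hadoop configuration file $1.
# Assumes <name>...</name> and the following <value>...</value> each sit on
# a single line. Prints nothing if the property is not found.
get_xml_property() {
    awk -v want="$2" '
        # Remember the most recent property name seen.
        match($0, /<name>[^<]*<\/name>/) {
            current = substr($0, RSTART + 6, RLENGTH - 13)
        }
        # Print the value that follows the wanted name, then stop.
        match($0, /<value>[^<]*<\/value>/) {
            if (current == want) {
                print substr($0, RSTART + 7, RLENGTH - 15)
                exit
            }
        }
    ' "$1"
}
```

For example, get_xml_property core-site.xml io.file.buffer.size, run in the hadoop-conf-staging directory, prints the current buffer size so that you can verify the change from step 4.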

    Minimizing out of memory errors when fetching a large number of documents

    To minimize memory errors when fetching a large number of documents, you can change some Flow Manager property file settings. You can also increase the amount of memory allocated for the data fetching process.

    When memory errors are generated, they are listed in the flowmgr.log file located in the directory. The default location is the /home/hadoop/FlowMgr directory.

    Procedure

    1. Log on to the Hadoop master node as the hadoop user.

    2. Go to the /toroBackend/crawler/conf directory.

    3. Open the crawler.properties file in an editor.

    a. Set the max.running.split.count property to a lower value. For example, change it from 15 to 10.

    b. Lower the thread.slice.count property value. For example, change it to 200.

    4. Optional: You can specify the amount of memory allocated for retrieving data.

    a. Go to the /toroBackend/flowmanager/scripts/ directory.

    b. Make a backup copy of the flowmanager.sh file.

    c. Open the flowmanager.sh file in an editor.

    d. Locate the following line.

    $JAVA ${remotedebug} ${healthcenter} -Xmx512m -D-cp $CLASSPATH $MAIN_CLASS $STATE_STORE_DIR

    e. Add the crawler.max.heapsize property value to this line between -D and -cp, as shown in the following example.

    $JAVA ${remotedebug} ${healthcenter} -Xmx512m -Dcrawler.max.heapsize=-cp $CLASSPATH $MAIN_CLASS $STATE_STORE_DIR

    f. For the variable, type in the amount of memory in megabytes. The default value is 1024 megabytes.
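Step 4b asks for a backup copy of flowmanager.sh before it is edited. One way to do this is sketched below, assuming a timestamped copy next to the original is acceptable; the function name backup_file and the .bak suffix are hypothetical choices, not product conventions.

```shell
#!/bin/sh
# Copy file $1 to "$1.bak.<timestamp>", preserving mode and modification
# time, and print the path of the copy. (Hypothetical helper.)
backup_file() {
    backup="$1.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$1" "$backup" && printf '%s\n' "$backup"
}
```

Running backup_file flowmanager.sh in the scripts directory before step 4c leaves a copy that can be restored if the edited script fails to start.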

    Increasing the heap size for WebSphere Embedded Application Server in large deployments

    The initial and maximum heap size values of the Java virtual machine that is used by IBM WebSphere Embedded Application Server are specified in the topology file. To improve performance in larger deployments, you might want to change these values.

    Before you begin

    Ensure that you have the following information:

    • WebSphere Embedded Application Server profile folder name.

    The default location is /local/ibm/cognos/ci/coninsight/was_sma.

    • WebSphere Embedded Application Server name.

    The default server name is sma1.

    • WebSphere Embedded Application Server admin user name.

    The default admin user name is cciusr.

    • WebSphere Embedded Application Server password.

    The password is set when you generate the topology.

    Procedure

    1. To change the heap size for IBM WebSphere Embedded Application Server, on the data node and user interface node servers, log on as the cciusr user.

    a. Go to the /was_sma/scripts directory.

    b. Type the following command:

    ../ewas/profiles/sma1Profile/bin/wsadmin.sh -user -password -lang jython -f set_jvm_mem.py

    For example,

    ../ewas/profiles/sma1Profile/bin/wsadmin.sh -user cciusr -password -lang jython -f set_jvm_mem.py sma1 512 1024

    2. Restart the WebSphere Embedded Application Server:

    ../ewas/profiles/sma1Profile/bin/stopServer.sh sma1 -username -password

    ../ewas/profiles/sma1Profile/bin/startServer.sh sma1

    Temporary directory not found errors when IBM Hadoop master node mounts external NFS server

    If you deploy IBM Social Media Analytics by using an external network file system (NFS) server to store the shared directory, and the IBM Hadoop Master computer mounts the remote NFS server shared drive, jobs might stop running for several hours and you might see intermittent _temporary directory not found errors.

    The Flow Manager process runs the query language for the JavaScript Object Notation (JAQL) process and uses the shared working directory on the NFS server to package the Java archive (JAR) file before sending the file to Hadoop. If the computer that runs the JAQL process uses a remote mount for the NFS server shared drive, the I/O operation might take a long time because of the large number of files that must be compressed over the network. Other I/O operations might also be affected.

    To prevent this issue, try one of