In the case of ‘decreased growth’ or ‘reducing fever’ we would expect the system to capture growth and fever, but the task of associating a change (or a negated change) is given to a higher-level annotation scheme, such as SemRep, which captures semantic relationships from the literature [24,31], or the Claim framework, where such relationships would be captured as an explicit, implicit, or comparison claim or as an association. Several of the outcome examples in  conform perfectly to the comparison claim type, where automated approaches have to identify the entity and endpoints (which were also defined as noun phrases) within a comparison sentence [7,25]. Although all representational choices have trade-offs, the noun phrase unit of analysis appears to better align with actual outcome expressions used by experts and reported in the STEEP, DATECAN and REMARK standards described in Section 2.1.

Journal of Biomedical Informatics: X 1 (2019) 100005
With respect to scope, there has been some discussion concerning the difference between defining an outcome as a specific measure versus defining an outcome as the overall conclusion of a randomized clinical trial. We concur with the position that “clinical outcomes must be carefully distinguished in the text from the outcomes of clinical trials themselves” [27, p. 79], but also recognize that this distinction can be difficult to discern. For example, the Medical Subject Heading controlled vocabulary defines “Treatment Outcome” as “Evaluation undertaken to assess the results or consequences of management and procedures used in combating disease in order to determine the efficacy, effectiveness, safety, and practicability of these interventions in individual cases or series.”2 Maintaining the distinction between a measure and the overall trial result is important, however, because the GRADE guidelines state that the benefit of an intervention is ultimately established by a patient’s preferences concerning the desirable and non-desirable outcomes.
As with prior work, we frame the task of outcome extraction as a classification activity, where both rule-based [12,20] and learning techniques [12,21] have been explored. The features used in our machine learning experiments include stemmed words that appear before and after the outcome and are also informed by lists of terms, which differs from prior work that employs cue phrases [12,9], the location of a sentence [12,21], bag of words, concept unique identifiers, and headings . We also do not require manual pruning by a registered nurse to remove “topic-specific terms,” nor do we combine models using “ad hoc weights selected based on our intuitions about the prediction of the base classifier” or the least squares linear regression used to select weights optimally in prior work . We did explore a baseline model that identified the best stemmed words using information gain to identify outcome sentences and then identified all noun phrases within the sentence, but stopped pursuing that direction when results were not fruitful. Lastly, a combination of machine learning and weak rules  and regular expressions have been used to identify outcomes, along with the semantic types of terms used near the target of interest .
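To make the baseline concrete, the sketch below shows one way information gain can rank a binary stemmed-word feature for outcome-sentence classification. The function and the toy data are hypothetical illustrations, not the authors' implementation; assumed names include `information_gain` and the example stem "improv".

```python
import math
from collections import Counter

def information_gain(labels, feature_present):
    """Information gain of a binary feature (e.g. a stemmed word's presence)
    over binary sentence labels (1 = outcome sentence)."""
    def entropy(ys):
        n = len(ys)
        if n == 0:
            return 0.0
        return -sum((c / n) * math.log2(c / n) for c in Counter(ys).values())

    # Split the labels by whether the feature fires in each sentence.
    with_f = [y for y, f in zip(labels, feature_present) if f]
    without_f = [y for y, f in zip(labels, feature_present) if not f]
    n = len(labels)
    return (entropy(labels)
            - (len(with_f) / n) * entropy(with_f)
            - (len(without_f) / n) * entropy(without_f))

# Toy corpus of six sentences: does the stem "improv" appear in each?
labels = [1, 1, 1, 0, 0, 0]        # 1 = outcome sentence
has_improv = [1, 1, 0, 0, 0, 0]    # feature indicator per sentence
gain = information_gain(labels, has_improv)
```

Features would be ranked by `gain` and the top-scoring stems kept as classifier inputs; a feature that perfectly separates the classes scores 1.0, an uninformative one scores 0.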
In addition to the machine learning algorithms (decision tree, naïve Bayes, support vector machines, and the general linear model), we also explored a list model, where all phrases and abbreviations that were in the same list as a seed outcome were deemed new outcomes. Lists have been successfully leveraged to identify ontological relationships ; however, to our knowledge lists have not been explored to identify outcomes. The list structure also informed feature selection strategies used in the machine learning models, and the experiments reported here show the extent to which those strategies impact predictive performance.
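One plausible reading of the list model is sketched below: any phrase that shares a list with a known (seed) outcome is itself labeled an outcome, and newly labeled phrases are allowed to propagate through further lists. The function name `expand_outcomes`, the transitive propagation, and the sample lists are assumptions for illustration, not the paper's exact procedure.

```python
def expand_outcomes(lists, seed_outcomes):
    """Label as outcomes all phrases co-occurring in a list with a seed
    outcome, propagating until no list contributes a new phrase."""
    found = set(seed_outcomes)
    changed = True
    while changed:
        changed = False
        for phrases in lists:
            # If any phrase in this list is already a known outcome,
            # every other phrase in the list becomes an outcome too.
            if found & set(phrases):
                new = set(phrases) - found
                if new:
                    found |= new
                    changed = True
    return found

# Hypothetical lists extracted from article text.
lists = [
    ["overall survival", "progression-free survival"],
    ["fever", "nausea"],
]
outcomes = expand_outcomes(lists, {"overall survival"})
```

Here the seed "overall survival" pulls in "progression-free survival" from the first list, while the second list, containing no seed, contributes nothing.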