
2.3 Big Data Thinking and Automatic Fuzzy Logic

2.3.1 Anti-Accuracy Worship—the Bald Man Paradox

The Bald Man Paradox usually refers to the most clear-cut of the paradoxes of vagueness, the sorites paradox. Or, strictly speaking, it refers to one of the dramatizations of this paradox. This case is nevertheless fully representative of the general issues involved. The allegedly paradoxical argument is well known. It might be formulated as follows:

(P.1) Premise one: a man with no hairs is bald.

(P.2) Premise two: if a man with n hairs is bald, then so is a man with n + 1 hairs.

(Colloquially speaking, adding a single hair will not rescue him from baldness!)

(C) Conclusion: no matter how many hairs a man has, he is bald.

The paradox lies in the fact that the premises (P.1) and (P.2) seem to be unproblematically true but the conclusion (C) false.

If the reasoning involved in the paradox seems so unproblematic, how can the paradox be explained away? What can the bald man tell us? Here comes the first and quite possibly foremost strategic moral that can be elicited from discussions of the different paradoxes, such as the sorites paradox and the paradox of the liar. An enormous amount of inspiration and perspiration has been devoted to sundry “solutions” of the paradoxes. It is nevertheless by this time eminently clear that the only way of clearing up such major paradoxes is a deeper analysis of the basic concepts figuring in them. To discuss the paradoxes as separate puzzles without digging deeper into their sources in logic and semantics is not much more instructive than solving crossword puzzles. The difficulty of the paradoxes is reflected in the fact that this proposed deeper analysis forces us to take a long hard look at the questions that underlie all logic. In the case of the sorites paradox, the basic concepts that hold the key to understanding it are negation and mathematical induction.

The basic insight that was not heeded in the earlier discussion is thus that if we admit truth-value gaps, we must have in our logic a strong dual negation that does not obey the law of excluded middle. This is perhaps the most important thing that the bald man can tell us. Otherwise we cannot, for instance, deal with the totality of truth-value gaps, nor with the meaning of any third (indefinite) truth-value.

Fortunately, there exists a logic which can serve these purposes. This logic is known as Independence-Friendly (IF) logic. It is more fundamental than our ordinary first-order logic, which could be called, somewhat inaccurately historically, the Frege-Russell logic. As has been repeatedly pointed out, Frege-Russell logic is unnecessarily restricted in its expressive power. There are several crucial logical and mathematical ideas that cannot be expressed in ordinary first-order logic but can be expressed by means of IF logic, including equicardinality, infinity and topological continuity. Now we can add to this list the service that IF logic can perform for the solution of the sorites paradox.

2.3.2 Core of Fuzzy Logic—Fuzzy Thinking

How can we represent expert knowledge that uses vague and ambiguous terms in a computer?

Fuzzy Logic is not logic that is fuzzy, but logic that is used to describe fuzziness. Fuzzy Logic is the theory of fuzzy sets, sets that calibrate vagueness. Fuzzy Logic is based on the idea that all things admit of degrees. Temperature, height, speed, distance, beauty: all come on a sliding scale. For instance:

· The motor is running really hot.

· Tom is a very tall guy.

Boolean Logic uses sharp distinctions (Fig. 2.2). It forces us to draw lines between members of a class and nonmembers. For instance, we may say that Tom is tall because his height is 181 cm. If we drew a line at 180 cm, we would find that David, who is 179 cm, is small. Is David really a small man, or have we just drawn an arbitrary line in the sand?

Fig. 2.2 Boolean Logic and Multi-Valued Logic

Fuzzy, or multi-valued, logic (Fig. 2.2) was introduced by Łukasiewicz, who extended the range of truth values to all real numbers in the interval between 0 and 1. For example, the possibility that a man 181 cm tall is really tall might be set to a value of 0.86.

In the language of possibility theory, we would say that it is quite possible that the man is tall.
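To make the contrast concrete, here is a minimal Python sketch. The 180 cm cutoff comes from the example above; the fuzzy breakpoints of 170 cm and 190 cm are illustrative assumptions, so the membership computed for 181 cm will not match the 0.86 quoted above.

def crisp_tall(height_cm):
    """Boolean-logic view: 'tall' is all-or-nothing at a 180 cm cutoff."""
    return 1 if height_cm >= 180 else 0

def fuzzy_tall(height_cm):
    """Fuzzy view: membership in 'tall' rises linearly from 0 at 170 cm
    to 1 at 190 cm (illustrative breakpoints, not taken from the text)."""
    if height_cm <= 170:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 170) / 20.0

for name, h in [("Tom", 181), ("David", 179)]:
    print(name, crisp_tall(h), round(fuzzy_tall(h), 2))
# Tom 1 0.55 and David 0 0.45: crisp logic calls Tom tall and David
# not tall, while fuzzy logic sees them as nearly equally tall.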

In 1965, Zadeh proposed the concept of “fuzzy sets” in his seminal paper.

Why fuzzy?

As Zadeh said, the term is concrete, immediate and descriptive; we all know what it means.

Why logic?

Fuzziness rests on fuzzy set theory, and Fuzzy Logic is just a small part of that theory.

The term Fuzzy Logic is used in two senses:

In the narrow sense, Fuzzy Logic is a branch of fuzzy set theory which deals (as logical systems do) with the representation of, and inference from, knowledge. Unlike other logical systems, Fuzzy Logic deals with imprecise or uncertain knowledge.

In the broad sense, Fuzzy Logic is synonymous with fuzzy set theory.

Fuzzy Logic is a set of mathematical principles for knowledge representation based on degrees of membership. Fuzzy Logic uses the continuum of logical values between 0 (completely false) and 1 (completely true).

However, our own language is also the supreme expression of sets. For example, “car” indicates the set of cars. When we say “a car”, we mean one out of the set of cars.

The classical example of a fuzzy set is tall men (Fig. 2.3). The elements of the fuzzy set “tall men” are all men, but their degrees of membership depend on their height.

Fig. 2.3 Degree of Membership for Tall Men

The x-axis represents the universe of discourse: the range of all possible values applicable to a chosen variable. The y-axis represents the membership value of the fuzzy set, as shown in Fig. 2.4.

The universe of discourse (men's heights) consists of three sets: short, average and tall men. As you can see in Fig. 2.5, a man who is 184 cm tall is a member of the average-men set with a degree of membership of 0.1, and at the same time he is also a member of the tall-men set with a degree of 0.4. First, we determine the membership functions; in our “tall men” example, this gives us the fuzzy sets of tall, short and average men.
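As a minimal sketch of such membership functions in Python, using trapezoids with illustrative breakpoints (assumptions, not values read off Fig. 2.5):

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 below a, rising on [a, b], 1 on [b, c],
    falling on [c, d], and 0 above d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Illustrative breakpoints for the three sets over men's heights in cm:
short   = lambda h: trapezoid(h, -1, 0, 160, 170)     # fully short up to 160 cm
average = lambda h: trapezoid(h, 160, 170, 180, 190)
tall    = lambda h: trapezoid(h, 180, 190, 240, 241)  # fully tall from 190 cm

h = 184
print({name: round(f(h), 2)
       for name, f in [("short", short), ("average", average), ("tall", tall)]})
# -> {'short': 0.0, 'average': 0.6, 'tall': 0.4}

With these breakpoints, a 184 cm man gets a tall-membership of 0.4, matching the text, although his average-membership comes out at 0.6 rather than the 0.1 read off Fig. 2.5; the numbers depend entirely on where the breakpoints are placed.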

At the root of fuzzy set theory lies the idea of linguistic variables.

A linguistic variable is a fuzzy variable. For example, the statement “John is tall” implies that the linguistic variable John takes the linguistic value tall.

Fig. 2.4 Tall Men in Crisp and Fuzzy Sets

Fig. 2.5 Men's Heights in Three Sets

In fuzzy Expert Systems, linguistic variables are used in fuzzy rules. For example:

IF wind is strong,

THEN sailing is good.

IF project duration is long,

THEN completion risk is high.

IF speed is slow,

THEN stopping distance is short.

The range of possible values of a linguistic variable represents the universe of discourse of that variable. For example, the universe of discourse of the linguistic variable speed might have the range between 0 and 220 km/h and may include such fuzzy subsets as very slow, slow, medium, fast, and very fast.

Fuzzy sets can be modified by qualifiers called hedges. Hedges are terms that modify the shape of fuzzy sets. They include adverbs such as very, somewhat, quite, more or less, and slightly.
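A common way to implement hedges, found throughout the fuzzy-logic literature, is to raise membership values to a power: “very” concentrates a set by squaring, while “more or less” dilates it by taking the square root. A minimal sketch:

very         = lambda mu: mu ** 2     # concentration: "very A"
more_or_less = lambda mu: mu ** 0.5   # dilation: "more or less A"

mu_tall = 0.86                            # degree to which a man is tall
print(round(very(mu_tall), 2))            # 0.74: "very tall" to a lesser degree
print(round(more_or_less(mu_tall), 2))    # 0.93: "more or less tall" to a greater degree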

Operations on fuzzy sets are shown in Fig. 2.6.

Fig. 2.6 Operations on Fuzzy Sets
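The standard (Zadeh) versions of these operations act pointwise on membership values: the complement is one minus the membership, intersection takes the minimum, and union takes the maximum. A minimal sketch:

def fuzzy_not(mu_a):           # complement: NOT A
    return 1.0 - mu_a

def fuzzy_and(mu_a, mu_b):     # intersection: A AND B
    return min(mu_a, mu_b)

def fuzzy_or(mu_a, mu_b):      # union: A OR B
    return max(mu_a, mu_b)

mu_tall, mu_average = 0.4, 0.1   # the 184 cm man's memberships from above
print(fuzzy_not(mu_tall))                # 0.6: degree of "not tall"
print(fuzzy_and(mu_tall, mu_average))    # 0.1: "tall AND average"
print(fuzzy_or(mu_tall, mu_average))     # 0.4: "tall OR average"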

A fuzzy rule can be defined as a conditional statement in the form:

IF x is A,

THEN y is B.

where x and y are linguistic variables, and A and B are linguistic values determined by fuzzy sets on the universes of discourse X and Y, respectively.
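A minimal sketch of how such a rule can be evaluated in the common Mamdani style (an assumption here, since the text does not name an inference method): the rule fires to the degree that “x is A” holds, and the consequent set B is clipped at that firing strength. The wind and sailing sets below are hypothetical.

def evaluate_rule(mu_antecedent, x, mu_consequent):
    """Clip the consequent membership function at the rule's firing strength."""
    firing_strength = mu_antecedent(x)          # degree to which "x is A" holds
    return lambda y: min(firing_strength, mu_consequent(y))

# Hypothetical sets: "strong wind" over wind speed in m/s,
# "good sailing" over a 0-10 sailing-quality score.
mu_strong_wind  = lambda w: max(0.0, min(1.0, (w - 5) / 10))   # 0 at 5 m/s, 1 at 15 m/s
mu_good_sailing = lambda s: max(0.0, min(1.0, s / 10))

clipped = evaluate_rule(mu_strong_wind, 12, mu_good_sailing)   # wind speed = 12 m/s
print(round(clipped(9), 2))   # 0.7 = min(firing strength 0.7, mu_good_sailing(9) = 0.9)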

The last step in the fuzzy inference process is defuzzification. Fuzziness helps us to evaluate the rules, but the final output of a fuzzy system has to be a crisp number. The input to the defuzzification process is the aggregate output fuzzy set, and the output is a single number.

There are several defuzzification methods, but probably the most popular one is the centroid technique. It finds the point where a vertical line would slice the aggregate set into two equal masses.

As shown in Fig. 2.7, the centroid defuzzification method finds a point representing the centre of gravity of the fuzzy set A on the interval ab. A reasonable estimate can be obtained by calculating it over a sample of points.
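Following that suggestion, here is a minimal Python sketch that estimates the centroid over sampled points; the triangular aggregate set is illustrative.

def centroid(mu, a, b, n=1000):
    """Approximate the centre of gravity, sum(x * mu(x)) / sum(mu(x)), on [a, b]."""
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    numerator = sum(x * mu(x) for x in xs)
    denominator = sum(mu(x) for x in xs)
    return numerator / denominator

# Illustrative aggregate output set: a triangle peaking at 60 on [30, 90].
mu_aggregate = lambda x: max(0.0, 1 - abs(x - 60) / 30)
print(round(centroid(mu_aggregate, 0, 100), 1))   # -> 60.0, the crisp output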

Fig. 2.7 Centroid Defuzzification Method

2.3.3 Big Data and the Realization of Automatic Fuzzy Logic—Turning an Intelligence Problem into a Data Problem

Why do companies struggle with AI? Some people believe that rapid waves of technology have left many companies a little clueless about what they should focus on. The moves from data centers to the Cloud, from the web to the mobile web to native apps, and from Big Data to AI have not made it easy for enterprise companies. In particular, the discussions between the CFO, who wants to reduce costs, and the CTO, who wants to have the best possible technology at his or her disposal, can lead to indecision, which leads to no decision.

Another reason why some people believe companies struggle with AI is that the lack of evidence that AI can have an impact leads a lot of people to believe that it is just hype and will eventually go away. Many industries, including financial services, transportation and insurance, have used data and computerized decision making to improve their business, but other industries that do not have the same level of data are harder to convince that AI can have an impact on their business.

Bridging the Qualitative-to-Quantitative Gap in Data Science is shown in Fig. 2.8.

Fig. 2.8 Data Driven Team for Qualitative-to-Quantitative Gap

Consider the following mini-scenarios:

During a regular weekday lunch, as you are discussing how everybody's weekend was, one of your colleagues mentions she watched a particular movie that you have also been wanting to watch. To get her feedback on the movie, you ask her: “Hey, was the movie's direction up to the mark?”

You bump into a colleague in the hallway whom you haven't seen for a couple of weeks. She mentions she just returned from a vacation at a popular international destination. To find out more about the destination, you ask her: “Wow! Is it really as exotic as they show in the magazines?”

Your roommate got a new video game that he has been playing nonstop for a few hours. When he takes a break, you ask him: “Is the game really that cool?”

Did you find any of these questions “artificial”? Do re-read the scenarios and take a few seconds to think through. Most of us would find these questions to be perfectly natural!

What would certainly be artificial though is asking questions like:

“Hey, was the movie direction 3.5-out-of-5?”, or

“Is the vacation destination 8 on a scale of 1-to-10?”, or

“Is the video game in the top 10 percentile of all the video games?”

In most scenarios, we express our asks in qualitative terms. This is true for business requirements as well.

Isn't it more likely that the initial client ask will be “build us a landing page which is aesthetically pleasing yet informative” versus “we need a landing page which is rated at least 8.5-out-of-10 by 1000 random visitors to our website on visual-appeal, navigability and product-information parameters”?

On the other hand, systems are built and evaluated based on exact quantitative requirements. For example, the database query has to return in less than 30 milliseconds, the website has to fully load in less than 3 seconds on a typical 10 Mbps connection, and so on.

This gap between qualitative business requirements and quantitative machine requirements is exacerbated when it comes to data-driven products.

A typical business requirement for a data-driven product could be “develop an optimal digital marketing strategy to reach the likely target customer population”. Converting this to a quantifiable requirement has several non-trivial challenges. Some of these are:

How do we define “optimal”? Do we focus more on precision or more on recall? Do we focus more on accuracy (is the approached customer segment really our target customer segment or not)? Or do we focus more on efficiency (how quickly do we make a go/no-go decision once the customer segment is exposed to our algorithm)? A minimal sketch of the precision-recall trade-off follows this list.

How do we actually evaluate whether we have met the “optimal” criteria? And if not, how much of a gap exists?

To define customers “similar” to our target population, we need to agree on a set of N dimensions that will be used for computing this similarity:

Patterns in the browsing history;

Patterns in e-shopping;

Patterns in user-provided meta-data, and so on. Or do we need to devise a few other dimensions?

After that, we need to critically evaluate whether all the relevant data exists in an accessible format. If not, are there ways to infer at least parts of it?
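As promised above, here is a minimal sketch of the precision-recall trade-off over a labelled sample; the customer IDs are made up for illustration.

def precision_recall(predicted, actual):
    """predicted, actual: sets of customer IDs (flagged vs. truly in the segment)."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

predicted = {1, 2, 3, 4}       # customers our algorithm approaches
actual = {2, 3, 4, 5, 6}       # customers truly in the target segment
print(precision_recall(predicted, actual))   # -> (0.75, 0.6)

Optimizing for precision penalizes approaching the wrong customers; optimizing for recall penalizes missing the right ones. Which mix is “optimal” is exactly the business decision in question.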

2.3.4 Is Big Data Making Automatic Fuzzy Logic More Accurate or Fuzzier?

We're not that much smarter than we used to be, even though we have much more information—and that means the real skill now is learning how to pick out the useful information from all this noise.

—Nate Silver

The sort of Big Data that analysts and social scientists frequently use in their research can be described as “found data”: observational data collected from website traffic, sensor data, or any large-scale source of user activity. This data is often labeled as “big” because it can easily contain many millions of records reflecting user behaviors on a website, such as viewing, clicking, downloading, uploading, evaluating, and purchasing of digital resources. In most cases, these data are snapshots in time, collected from the sample of individuals who happen to be active in that particular moment. Examples of this sort of data are website log files or traffic data, social media data dumps, online professional networks (e.g., ones where teachers learn about jobs and post teaching resources), or even massive open online courses with user interaction and performance data (e.g., Coursera). As cell phones and wearable devices begin to collect sensor data, Big Data will only get bigger, and the problem will only become more amplified and more prevalent.

Ironically, we are taught in statistics classes that “more data is better”. However, the size of today's “big” data has made it cumbersome to work with and prevents most researchers from performing basic data due diligence.

The sheer size of Big Data can also lead analysts to confuse it with the statistical ideal of a “population”, whereas in fact it is a very biased sample. Since observational data from online sources are not derived from statistically rigorous, designed experiments, they can often contain many types of biases.

One form of bias encountered in web data is population bias. Many researchers treat their Big Data as a “census”, meaning that it is a complete and representative collection of the entire population. However, in many cases it can contain a misrepresentative mixture of subpopulations. For example, websites (and consequently the data from them) are frequently crawled by “robots”: computer programs designed to behave like users but which are just collecting data from web pages. Data collected from social websites, for example, can over-represent certain user groups and include some pages that are not people at all (fan pages).

Another major form of bias encountered in web data is referred to as “activity bias”. This arises because the data collected are skewed toward the users who are active on the website during the measurement period. Since most users show up once at a website and never return (at least not in the observed window of time), the conclusions drawn from the data are specific only to the sample of people active at that time. The size of the data, and the fact that researchers are often analyzing the whole database, leads them to erroneously believe that they have an entire, statistically representative population. The hidden biases in Big Data have the dangerous potential of causing a researcher to draw conclusions that are far from the truth.

The use of found data with standard statistical procedures that rely on typical statistical assumptions tends to produce a large number of statistically significant effects for what may really be a non-representative subpopulation. All too often, these sorts of analyses will afford precisely inaccurate results.

To make matters worse, a data set is often victim to more than one type of error. Some examples of how errors can arise:

Outdated or incomplete information may persist due to the cost and/or effort of obtaining up-to-date information.

An organization that uses multiple data sources may incorrectly interweave data sets and/or be unaware of causal relationships between data points, and lack proper data governance mechanisms to identify these inconsistencies.

An organization may fall prey to data collection errors:

Using biased sample populations (subject to sampling biases based on convenience, self-selection, and/or opt-out options, for instance).

Asking leading or evaluative questions that increase the likelihood of demand effects (for example, respondents providing what they believe to be the “desired” or socially acceptable answer versus their true opinion, feeling, belief, or behavior).

Collecting data in suboptimal settings, which can also lead to demand effects (for example, exit polls, public surveys, or any mechanism or environment in which respondents do not feel their responses will be truly anonymous).

Relying on self-reported data versus observed (actual) behaviors.

Data analysis errors may lead to inaccuracies due to:

Incorrect inferences about consumers' interests (for example, inferring that the purchase of a hang-gliding magazine suggests a risky lifestyle when the purchaser's true motive is an interest in photography).

Incorrect models (for instance, incorrect assumptions, proxies, or presuming a causal relationship where none exists).

Malicious parties may corrupt data (for example, cybercrime activity that alters data and documents).

There is growing recognition that much Big Data is built on inaccurate information, driving incorrect, suboptimal, or disadvantageous actions. Some initial efforts are under way to put in place regulations around Big Data governance and management.

Fuzzy Logic is a “degrees of truth” approach rather than a “true or false” (1 or 0) one. The idea of Fuzzy Logic was first introduced by Dr. Zadeh of the University of California in the 1960s. Fuzzy Logic yields decisions similar to those of human perception and reasoning, and it has proved to work well in expert systems. The creation of fuzzy sets helps in determining the degree, ranging from 0 to 1, to which an element belongs to a set. It is used for making decisions under uncertainty.

The paradigm of learning from past experience and using that data to try to improve future performance is known as Machine Learning. Machine Learning provides alternative solutions drawn from vast amounts of data by developing algorithms that process real-time data and give accurate results and analysis. Machine Learning aims at the development of computer programs that can access data and learn on their own.

Machine Learning algorithms aim at extracting knowledge from large amounts of data and offer traditional methods for classification and clustering. They handle a variety of data and can be used in large environments, although the algorithms need learning time before they perform with accuracy and relevance. Fuzzy Logic measures the certainty of the problem, and its algorithms are robust and adapt easily to changing environments.

Machine Learning techniques for decision making produce good results by handling large data environments, and they give experts in different fields good ideas for future enhancements in the fields they are involved in. Fuzzy Logic equally helps in identifying the uncertainties in a problem; fuzzy systems adapt themselves to changing environments and likewise support decision making.
