Let us not quarrel about the data?
Virtually everyone who has attended linguistics talks at a workshop can recall a situation like this. A paper-giver is presenting some standard or original theory and illustrates it with data from some language; the audience starts to challenge the data; someone who happens to be a native speaker of this language, or an expert in it, claims that example (17) is suspicious, example (19) is ill-formed, while example (5), contrary to the paper-giver, is fine, at least in the idiolect of this speaker. Paper-givers usually hate such remarks, since objections of this kind swallow up the time for discussion; besides, they often don't know how to respond and are annoyed that nobody tried to discuss the substantial points of their analysis. If they get to verbalize their reaction in the lobby, it often goes like this: "I am theory-minded. You guys are narrow-minded empiricists. We do different things." Many of us have been in the roles of both conflicting parties.
I am theory-minded, and I get angry too when minor (in my view) details block the discussion of my models. But I don't think that an obstinate opposition of models to linguistic data is productive, or that being at odds with the data is an advantage for any linguistic model. Nobody seems to defy this banality openly. There is, however, a current trend in linguistics to water down the notion of a 'norm' or 'standard form of a language' by appealing to the escape notion of a 'continuum', one of the trendy words for which linguists should be fined (not always, I suggest, but in some situations). I have heard opinions of this sort: "There is a continuum of speakers' judgments from 'completely fine' to 'completely unacceptable'. Therefore, one should not rely on speakers' intuitions too much, since virtually every conceivable description will correspond to somebody's usage." A position like this is characteristic of some generative linguists and of many corpus linguists: the latter are prone to show us abnormal usages and claim that they are okay if some crazy mind or careless tongue generated them and some fussy student tagged them in the corpus. I think this is a dangerous position, incompatible with a generative outlook. If Grammar really exists and is not just a fancy of mathematicians and formal linguists, then generative capacity (I am not specifying here whether it is weak or strong generative capacity) is characteristic of all speakers, irrespective of how codified their usage is. It is absurd to claim that only a codified standard and its pressure make speakers agree on a shared set of relevant grammatical constraints, and that oral languages and young written languages (see the page ORAL vs WRITTEN on this site) lack grammar: how could they have developed their parametric settings then?
Deriving grammatical constraints from field notes or from texts in a dead language is a difficult task, but it is in any case a technical problem, one that can be solved successfully if the field linguist or historical linguist is a craftsman and his or her collection of data is sufficient.
I am not arguing against high levels of abstraction, or against an approach whereby one first puts forward some promising model and only then works out to which layer of data it corresponds. Discrepancy with linguistic data is not fatal for a model as such; rather, it gives a stimulus to develop an alternative model or to verify the existing one in a different axiomatic system. All this has a double effect with regard to deductively closed theories, since such theories cannot be proved within the same formalism; see the post 'Word Order Calculus' on the page DIALOGUE 2008. What I am arguing against here is ignoring well-formed expressions: don't tell us that they don't exist in language L if your model fails to explain their existence. Don't claim that the boundary between the well-formed and the ill-formed doesn't exist if your theory fails to tell where it lies. Such claims amount to a renunciation of subtle linguistic analysis. For many decades both the structural and the generative frameworks have been producing fine tools for explaining virtually all known constraints, categorial distinctions, subcategorization frames and lexical collocations. Certainly, different people are able to use these fine tools to different extents, but that should remain their own problem. If I cannot imitate Chinese tones or predict the uses of the Slavic imperfective aspect, I probably should not be blamed for that, unless I start theorizing that the distribution of Chinese tones and Slavic aspect forms is chaotic and unpredictable. If you are a bit deaf and mix up a rising accent with a falling one, or a bit absent-minded and forget that, say, a certain category of language L is allowed in n+2 positions within vPs while you listed only n, this is not a tragedy. But please, don't tell us that every theory that does this work is on the wrong track because there is a 'continuum of judgments about the acceptability of a rising tone on this or that syllable'.